We constantly need to edit files. To edit a file means to open it in an application that allows you to change the content of the file and save it.
The application you will want to use for editing a file depends on what data the file contains.
For example, a simple text editor (such as Notepad on Windows) is often all you need to edit a text or a program.
But, if your file is an image, you may want to use GIMP, ImageJ or Photoshop (these are just examples).
In assignment 2, we suggest some ways to edit files, but any method will do.
I received a question on ‘edit and fix’: a student asked what these functions do.
I advise you to use your preferred search engine to become autonomous and find answers by yourselves.
It may happen that a topic or a technical point has not been explained in the lectures, or has only been covered briefly, because more important things had to be explained first. This is not the end of the world. Search for the information yourself to find out what an R function does.
For example, typing ‘edit R’ in Google will usually return, as the first result, the R documentation on the edit function. You can also use Quora or ask for help in forums, or even start your own blog on R programming!
Alternatively, you can use the R documentation inside RStudio, without any need for a search engine. If you want to get help for the function edit, just type ?edit in the console. The meaning of the function, a description of the arguments it needs, and an example of how to call it will be displayed in the lower-right pane.
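The console commands for getting help can be sketched as follows. This is only a minimal illustration: help("edit") is the long form of ?edit, and args() is an extra shortcut (not mentioned above) that prints the arguments of a function without opening the help page.

```r
# Open the documentation page for the edit function.
# This is the same as typing ?edit in the console.
help("edit")

# Print only the arguments of edit, without opening the help page.
args(edit)

# Same thing for fix, the other function the student asked about.
help("fix")
args(fix)
```

If you do not remember the exact name of a function, typing ??keyword in the console searches all the installed help pages for that keyword.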
You will notice that when you try to knit your assignment, no output appears.
The cells of R code are not run!
This is because we use ‘eval = F’ (evaluation is false) in the header (first line) of the cells of R code.
So you have to remove this expression and reknit your file. Then the cell will execute properly.
You should probably proceed in the following way.
To answer a question, you write some code in a cell, leaving eval set to false.
I am stupid
Note that this code does not make sense, but the file knits properly because we are not evaluating the cell.
If we replaced {r, eval=F} with {r} in the header of the cell, this would produce an error (Error in parse) during knitting. Try it.
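You can also see the parse error in the console, without knitting anything. The sketch below is just an illustration: the variable names (bad_code, good_code, result, parsed) are made up for this example, and tryCatch is used only so that the script keeps running after the error.

```r
# The content of the broken cell, stored as a string.
bad_code <- "I am stupid"

# parse() checks the syntax of the code without running it.
# tryCatch() catches the error and returns its message instead of stopping.
result <- tryCatch(
  parse(text = bad_code),
  error = function(e) conditionMessage(e)
)
print(result)  # the message of the parse error

# A syntactically correct line parses without any error...
good_code <- 'print("I am far from stupid")'
parsed <- parse(text = good_code)

# ...and can then be evaluated, as knitr does when eval is not set to false.
eval(parsed)
```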
So use the console (just click on Console in RStudio) and try your code until it is right. When you are happy, paste your code back into the Rmd file, remove the ‘eval = F’ and knit it:
print("I am far from stupid")
## [1] "I am far from stupid"
You should answer all questions contained in the Rmd.
Sometimes, a question is asked in the text (unshaded background in RStudio).
You can answer a question in your text by using the following syntax.
Suppose that the question is: How many fingers are there in most human hands?
You could type:
There are 5 fingers in a human hand.
You can type any R code inline in your text, and knitting your file will substitute the result of running the code.
I can produce two samples from the normal distribution, for example -0.6126964, 1.3951341
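In an Rmd file, inline code is written between two backticks, starting with the letter r and a space, followed by the R expression. The two numbers above could come from an expression such as rnorm(2). The sketch below shows, in the console, what such an inline expression would compute. The seed value and the variable names are assumptions made for this example, so your numbers will differ from the ones shown above.

```r
# Fix the random seed so that the same numbers come out at every run.
# The value 123 is arbitrary.
set.seed(123)

# Draw two samples from the standard normal distribution.
two_samples <- rnorm(2)
two_samples

# Build the sentence that knitting would produce from the inline code.
sentence <- paste(
  "I can produce two samples from the normal distribution, for example",
  paste(two_samples, collapse = ", ")
)
print(sentence)
```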
# the ToothGrowth dataframe does not need to be created.
# R has many named dataframes ready to use
# to get the list of these datasets, type: data()
# What are the names of columns in the ToothGrowth dataframe?
names(ToothGrowth)
## [1] "len" "supp" "dose"
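Beyond the column names, a few other functions help you get to know a built-in dataframe. This is only a short sketch using standard R functions. The variable name builtin is made up for the example.

```r
# List the datasets that come with the datasets package.
# The result contains one row per dataset, with its name in the Item column.
builtin <- data(package = "datasets")$results
head(builtin[, "Item"])

# Number of rows and columns of ToothGrowth.
dim(ToothGrowth)

# Type of each column, with the first few values.
str(ToothGrowth)

# First three rows of the dataframe.
head(ToothGrowth, 3)
```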
The bookdown website is an interesting website for R programmers (among many others).
You can produce scatterplots of several columns.
If you use the pairs function on 6 columns, this will result in a plot showing a 6 x 6 grid of rectangles, each containing a subplot for a given pair of columns.
Type ?pairs to understand how scatterplot matrices work.
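Here is a minimal sketch of a scatterplot matrix on 6 columns. It uses mtcars, one of the dataframes that come with R, only as a stand-in. In the assignment you would use your own dataframe instead.

```r
# Keep the first 6 columns of the built-in mtcars dataframe.
six_columns <- mtcars[, 1:6]
names(six_columns)

# Draw the 6 x 6 scatterplot matrix.
# The rectangles on the diagonal show the column names.
pairs(six_columns, main = "Scatterplot matrix of 6 columns of mtcars")
```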
ifelse(test, yes, no) checks a condition for each item of a vector and returns the value yes where the condition is satisfied, or else the value no.
For example,
if x is a dataframe with a column named age,
you can retrieve the values of all ages (this is a vector) via the expression:
x$age
You can build a vector of logical values that are true where age is more than 10 and false otherwise, with the expression:
x$age > 10
Finally, you can assign to a vector y the value 1 where the age is more than 10 and 0 otherwise, via the expression:
y <- ifelse(x$age > 10, 1, 0)
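To see the three steps at work, here is a small sketch on a made-up dataframe. The ages are invented for this example and have nothing to do with the assignment data.

```r
# A hypothetical dataframe with a column named age.
x <- data.frame(age = c(4, 12, 10, 25))

# Step 1: the vector of all ages.
x$age

# Step 2: a logical vector, TRUE where the age is more than 10.
x$age > 10

# Step 3: 1 where the age is more than 10, 0 otherwise.
y <- ifelse(x$age > 10, 1, 0)
y
# [1] 0 1 0 1
```

Note that an age of exactly 10 gives 0, because the condition is strictly "more than 10".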
This should help you with the assignment.
15 points total
Questions 1 and 2 (3 points for reading in the data, adding row names, and dropping the first column)
Question 3 (1 point for summary output)
Question 4 (2 points for the table of Private vs Private2.)
Question 5 (1 point for the scatterplot)
Question 6 a.b.c.d (2 points, one for each 2x2 two-way frequency table)
Question 6.e (3 points, 1 for the two-way table of Elitec vs Elitef, 1 point for the summary outputs, and 1 point for the boxplots.)
Question 6.f (1 point for a correct explanation)
Question 7 (2 points for the correct number of colleges.)