This is the first of what will probably be many articles pertaining to R. I am assuming that you are familiar with R and that you have it installed. I am also assuming that you have the RStudio IDE installed.
Once you have the R Console open, you will first want to set your working directory.
This can be achieved with the command:
setwd("<pathway of working directory>")
For example, you could create a designated folder on your Windows Desktop and make that folder your working directory. The code for this would resemble:
setwd("C:/Users/Name/Desktop/RWorkDirectory")
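It is important to note that a path copied from Windows uses "\" as its separator, and you will have to change each "\" to "/". R treats a backslash inside a quoted string as an escape character, so a path typed with single backslashes will not be read correctly. (Doubling each backslash, as in "\\", also works.)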
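As a minimal sketch of the separator conversion and of setting a working directory, the code below converts a Windows-style path with chartr() and then switches into a folder. The folder here is a hypothetical one created under the session's temporary directory, so that the example runs on any machine; it is not the Desktop folder used above.

```r
# A Windows-style path. Inside an R string, each backslash must be doubled,
# because a single "\" starts an escape sequence.
windows_path <- "C:\\Users\\Name\\Desktop\\RWorkDirectory"

# Replace every backslash with a forward slash.
r_path <- chartr("\\", "/", windows_path)
print(r_path)

# Hypothetical folder inside the temporary directory (an assumption made
# so that the example does not depend on a particular Desktop).
work_dir <- file.path(tempdir(), "RWorkDirectory")
dir.create(work_dir, showWarnings = FALSE)

# setwd() invisibly returns the previous working directory,
# which makes it easy to switch back afterwards.
old_dir <- setwd(work_dir)
current_name <- basename(getwd())
print(current_name)

# Restore the original working directory.
setwd(old_dir)
```

Saving the value returned by setwd() is optional, but it is a convenient way to return to where you started.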
The advantage of establishing a working directory is that it makes importing, exporting, and saving data more convenient.
For example, if you were importing data without establishing a working directory, the code would resemble:
(Assuming that the file is a .csv)
DataFrameA <- read.table("C:/Users/Name/Desktop/RWorkDirectory/Filename.csv", fill = TRUE, header = TRUE, sep = "," )
or
(Assuming that the file is tab delimited)
DataFrameB <- read.table("C:/Users/Name/Desktop/RWorkDirectory/Filename.txt", fill = TRUE, header = TRUE, sep = "\t" )
If you had established the working directory, the code statement would be much shorter:
DataFrameA <- read.table("Filename.csv", fill = TRUE, header=TRUE, sep="," )
or
DataFrameB <- read.table("Filename.txt", fill = TRUE, header=TRUE, sep="\t" )
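To check that the short form and the full-path form really read the same file, here is a small self-contained sketch. It writes a hypothetical Filename.csv (with made-up Name and Score columns) into a folder under the temporary directory, then imports it both ways.

```r
# Work inside a throwaway folder so the example is self-contained.
work_dir <- file.path(tempdir(), "RWorkDirectory")
dir.create(work_dir, showWarnings = FALSE)

# Create a small hypothetical CSV file to import.
csv_path <- file.path(work_dir, "Filename.csv")
writeLines(c("Name,Score", "Ann,90", "Bob,85"), csv_path)

# Import using the full path; no working directory is required.
FullPathFrame <- read.table(csv_path, fill = TRUE, header = TRUE, sep = ",")

# Import using only the file name, after setting the working directory.
old_dir <- setwd(work_dir)
ShortPathFrame <- read.table("Filename.csv", fill = TRUE, header = TRUE, sep = ",")
setwd(old_dir)  # restore the previous working directory

# Compare the two results.
print(identical(FullPathFrame, ShortPathFrame))
```

Both calls point at the same file, so the only difference is how much of the path has to be typed.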
Import Options
fill, header, and sep are optional arguments (written in lowercase in the code), but their inclusion is typically necessary. Here is what each option does:
fill - This option notifies R that the rows may be of unequal length, and that some records will be missing values. If this option is enabled, blank fields are added to the short rows, and these are read as NA in numeric columns.
header - This indicates to R that the first row of data contains column names.
sep - This indicates the delimiter that separates the values in each row. "," indicates a comma separated file, and "\t" indicates a tab delimited file. If the data values are separated by some other character (ex. #, @, or |), you can indicate this by listing it after sep =. Ex. sep = "|".
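To see the three options working together, the sketch below reads a few lines of hypothetical pipe-delimited text in which the second record is missing its last value. The text = argument of read.table() is used only so that no file is needed; the column names are invented for illustration.

```r
# Hypothetical pipe-delimited data; the "Bob" record has no Grade value.
raw_text <- "Name|Score|Grade
Ann|90|1
Bob|85"

# header = TRUE takes the column names from the first line,
# sep = "|" sets the delimiter, and fill = TRUE pads the short row.
RaggedFrame <- read.table(text = raw_text, fill = TRUE, header = TRUE, sep = "|")
print(RaggedFrame)

# The padded field in the numeric Grade column is read as NA.
print(is.na(RaggedFrame$Grade[2]))
```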
Get Working Directory
If you ever forget where your working directory is located, you can always have it printed to the console by using the command:
getwd()
In our example case, running the above command should output:
[1] "C:/Users/Name/Desktop/RWorkDirectory"
In the next article, I will discuss how to check the integrity of newly imported data.