Microsoft Excel is a popular spreadsheet application from Microsoft and is perhaps the world's most widely used tool for data analysis. Unfortunately, its file formats, XLS and XLSX, do not work well with other software, particularly outside a Windows environment. Some experimentation may therefore be needed to find a setup that works for your operating system and the particular type of Excel file.
A wide range of software applications store their data in binary form. Binary formats are usually much smaller than their text equivalents, so using one typically improves performance. Many binary file formats are proprietary and go against free software principles. If you have the option, it is usually best to avoid such formats. In this chapter you will learn briefly about both the Excel and binary formats, along with the R packages and functions for working with them.
Install the xlsx Package
The xlsx package can be installed from the R console with the command below. R may prompt you to install additional packages on which the xlsx package depends.
install.packages("xlsx")
After the xlsx package is installed, you can verify the installation with the command:
"xlsx" %in% rownames(installed.packages())
Reading the Excel File
Any .xlsx file can be read with the read.xlsx() function once the xlsx package has been loaded with library(xlsx). In the example below, sheetIndex = 1 selects the first worksheet, and the result is stored as a data frame in the R environment.
library(xlsx)  # load the package so that read.xlsx() is available
info <- read.xlsx("filename.xlsx", sheetIndex = 1)  # read the first worksheet
print(info)
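To try read.xlsx() without an existing workbook, one option is to write a small data frame with write.xlsx(), also from the xlsx package, and read it back. This is only a minimal sketch: the data frame contents and the temporary file path are made up for illustration, and the code does nothing unless the xlsx package (which needs a working Java installation) can be loaded.

```r
# Hypothetical sample data, used only for this illustration.
sample_data <- data.frame(
  id    = c(1, 2, 3),
  score = c(90.5, 82.0, 77.25)
)

info_back <- NULL  # stays NULL when the xlsx package is unavailable

if (requireNamespace("xlsx", quietly = TRUE)) {
  path <- tempfile(fileext = ".xlsx")  # scratch file, deleted below

  # row.names = FALSE keeps the row names out of the worksheet.
  xlsx::write.xlsx(sample_data, path, sheetName = "Sheet1", row.names = FALSE)

  # Read the first worksheet back into a data frame.
  info_back <- xlsx::read.xlsx(path, sheetIndex = 1)
  print(info_back)

  unlink(path)
}
```

Calling the functions as xlsx::write.xlsx() and xlsx::read.xlsx() avoids attaching the package, which keeps the sketch harmless on machines where it is not installed.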
The Binary File Format
A binary file stores information purely as bits and bytes, i.e. 0s and 1s. It cannot be read directly by humans, because its bytes translate to characters and symbols that include many non-printable characters.
Sometimes data produced by other programs has to be processed by R as a binary file. R can also create binary files that can be shared with other programs. R provides two functions for working with binary files:
- writeBin() and
- readBin()
The first creates binary files and the second reads them. Their basic syntax is as follows:
writeBin(object, con)
readBin(con, what, n)
Here, the parameters used are:
- con is the connection object used for reading or writing the binary file
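- object is the R object (typically an atomic vector) to be written to the binary file
- what is the mode of the data to be read, such as character or integer; it tells R how to interpret the bytes
- n is the maximum number of records (values of the mode given by what) to read from the binary file; it is not a byte count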
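The two functions and their parameters can be seen together in a short round trip. The sketch below assumes a temporary scratch file and a made-up integer vector; the names path, values, and restored are illustrative only.

```r
# Write three integers to a temporary binary file, then read them back.
path <- tempfile(fileext = ".bin")  # hypothetical scratch file
values <- c(10L, 20L, 30L)          # made-up sample data

out <- file(path, "wb")             # connection opened for binary writing
writeBin(values, out)               # object = values, con = out
close(out)

inp <- file(path, "rb")             # connection opened for binary reading
# what = integer() sets the mode; n is the maximum number of values to read.
restored <- readBin(inp, what = integer(), n = length(values))
close(inp)

print(restored)         # [1] 10 20 30
print(file.size(path))  # [1] 12  (3 integers of 4 bytes each)

unlink(path)            # remove the scratch file
```

Opening the connections in "wb" and "rb" mode makes it explicit that the file is treated as binary rather than text.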