• Anybody here program in R and want to help me out?
Hey, so in my college course we are doing Probability and Statistics and we use a program called R. We have been given a project due in around 2 weeks, and I wanted to make a start on it, but I'm not too sure how to use the program, and the question itself seems a bit tricky as well. If anybody could point me in the right direction on how to attempt this question, it would be greatly appreciated. There is a second part, but it seems much easier than this one. Thanks, I'll leave the question below.

[quote]
You have the task of using R to analyse the results of a popular personality test – the Eysenck Personality Inventory (EPI). The data consists of scores obtained from a series of questions that measure a person’s traits (0 – lowest, 3 – highest), from a set of students of Northwestern University. Based on those scores, the EPI index measures how much the person is Extroverted/Introverted and Stable/Neurotic on a scale from 0 – 24 (0 – introversion/stability, 24 – extroversion/neuroticism). It has 5 variables – Extroversion score (epiE), Sociability score (epiS), Impulsivity score (epiImp), Lie score (epiLie - measures how much the subject was lying on the test) and Neuroticism score (epiNeur).

The data file can be found at [URL]http://www.computing.dcu.ie/~mbezbradica/teaching/CA266/epi.txt[/URL] *

1. Obtain summary information for all the variables in the data set (e.g. means, standard deviations, quantiles). What can you tell about the data at first glance?

2. Compare EPI index variables amongst each other (hint: use the subset function to extract subsets of variables from the original data set). Where can you see strong correlation and which variables are not correlated? For analysis use:
- stem and leaf diagrams
- boxplots
- scatterplots

3. Load the data from a second personality study from [URL]http://www.computing.dcu.ie/~mbezbradica/teaching/CA266/bfi.txt[/URL] *. This data includes the profiles of participants - Age; Gender (1 – males, 2 – females); Education (1 = HS, 2 = finished HS, 3 = some college, 4 = college graduate, 5 = graduate degree).

4. Plot histograms showing the age profiles of males and females. What can you tell about them?

5. Plot and compare participant gender vs. their education profile.

* The data set contains NA values where the person did not answer the posed question.
[/quote]
R has documentation and functions built into it. For instance, for a histogram it's just hist(x), and you could instead use kernel density estimation, which is also built in. It also has a built-in command, summary(x). I'm not sure how to explain R or point to a good tutorial on it (if you have access to Springer there are books on this: [url]http://www.springerlink.com/content/978-0-387-79053-4/#section=215103&page=1[/url]).

------

Well, I'm having a hard time with the text file, in that R is inserting its own column names and the real names end up in the first row of the data (reading bfi.txt doesn't have this problem). There's a way to deal with this but I forget. For mostly everything you can google the commands: standard deviation is sd(x), median absolute deviation is mad(x). In fact, if you type help(sd), R will open a web browser with the help documentation, and I haven't seen a help page without example code yet. I'll look into it; so far read.table("bfi.txt") works fine.
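For the column-name problem, what usually works is passing header = TRUE to read.table. Here is a minimal sketch of that. It uses a few made-up rows as a stand-in for epi.txt, so the values, and the assumption that the real file's first line holds the variable names, are mine and not taken from the actual file. It then runs the summary commands mentioned above.

```r
# Hypothetical stand-in for epi.txt: made-up scores, with the variable
# names from the assignment on the first line (the real file may differ).
epi_text <- "epiE epiS epiImp epiLie epiNeur
18 10 7 3 9
16 8 5 1 12
6 1 3 2 5
12 6 4 2 3
14 6 5 3 15"

# Without header = TRUE, R invents its own column names and the
# real names end up as the first row of the data.
raw <- read.table(text = epi_text)
print(names(raw))
print(raw[1, ])

# With header = TRUE, the first line is used as the column names
# and the remaining lines are read as numbers.
epi <- read.table(text = epi_text, header = TRUE)
print(names(epi))

# Summary information per variable.
print(summary(epi))                    # min, quartiles, median, mean
print(sapply(epi, sd, na.rm = TRUE))   # standard deviations
print(sapply(epi, mad, na.rm = TRUE))  # median absolute deviations

# Kernel density estimate for one variable (computed here, not plotted).
dens <- density(epi$epiE, na.rm = TRUE)
```

For the real data you would swap text = epi_text for the file path or URL, e.g. read.table("epi.txt", header = TRUE), and then hist(epi$epiE) or plot(dens) will draw the plots. The na.rm = TRUE arguments are there because the assignment says the data contains NA values.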
Thanks for the help mate, I'll look into it! Did you have any luck with the other file?