<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="generator" content="pandoc" />
<title>Data Frames</title>
<style type="text/css">code{white-space: pre;}</style>
<style type="text/css">
table.sourceCode, tr.sourceCode, td.lineNumbers, td.sourceCode {
margin: 0; padding: 0; vertical-align: baseline; border: none; }
table.sourceCode { width: 100%; line-height: 100%; }
td.lineNumbers { text-align: right; padding-right: 4px; padding-left: 4px; color: #aaaaaa; border-right: 1px solid #aaaaaa; }
td.sourceCode { padding-left: 5px; }
code > span.kw { color: #007020; font-weight: bold; }
code > span.dt { color: #902000; }
code > span.dv { color: #40a070; }
code > span.bn { color: #40a070; }
code > span.fl { color: #40a070; }
code > span.ch { color: #4070a0; }
code > span.st { color: #4070a0; }
code > span.co { color: #60a0b0; font-style: italic; }
code > span.ot { color: #007020; }
code > span.al { color: #ff0000; font-weight: bold; }
code > span.fu { color: #06287e; }
code > span.er { color: #ff0000; font-weight: bold; }
</style>
<link rel="stylesheet" href="Class.css" type="text/css" />
</head>
<body>
<h1 id="data-frames">Data Frames</h1>
<p>A data frame is a 2-dimensional data structure, but it is different from a matrix: the elements of a matrix all share a single type, whereas the columns of a data frame need not. The rows are observations and the columns are variables. All columns must have the same number of elements, and they are expected to be aligned so that the i-th element of each column corresponds to the same i-th observational unit.</p>
<p>The purpose of a data frame is to allow each column to have a different type. We can have integers in one column, logical values in another, Dates in another, and even a list column of more complex objects, e.g., each element of a column might itself be a data frame, a matrix, or a function.</p>
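<p>As a minimal sketch of mixed column types, the following builds a small data frame. The name d and all of its values are made up for illustration; the last column is a list whose elements are a matrix, a function and a data frame.</p>
<pre class="sourceCode r"><code class="sourceCode r"># d is a hypothetical example; each column has a different type
d = data.frame(id = 1:3,
               passed = c(TRUE, FALSE, TRUE),
               when = as.Date(c("2020-01-01", "2020-01-02", "2020-01-03")))

# A list column: one arbitrary R object per row
d$extra = list(matrix(1:4, 2), mean, data.frame(a = 1))

# Inspect the class of each column, and of one element of the list column
sapply(d, class)
class(d$extra[[1]])</code></pre>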
<p>A data frame is a list whose elements are the columns. We can confirm this with typeof(), which returns "list" for a data frame.</p>
<p>So we can use list subsetting:</p>
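<p>A quick check on the built-in mtcars data, as a sketch: typeof() reports how the object is stored, while class() reports what it behaves as.</p>
<pre class="sourceCode r"><code class="sourceCode r">typeof(mtcars)    # "list": the underlying storage
class(mtcars)     # "data.frame": the class layered on top of the list
is.list(mtcars)   # TRUE

# The length of the list is the number of columns
length(mtcars) == ncol(mtcars)</code></pre>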
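<pre class="sourceCode r"><code class="sourceCode r">mtcars[ <span class="kw">c</span>(<span class="dv">1</span>, <span class="dv">2</span>) ]                   <span class="co"># by position: a data frame with the first two columns</span>
mtcars[ <span class="kw">c</span>(<span class="st">"mpg"</span>, <span class="st">"wt"</span>) ]            <span class="co"># by name: a data frame with these two columns</span>
mtcars[ <span class="kw">grepl</span>(<span class="st">"^d"</span>, <span class="kw">names</span>(mtcars) ) ]  <span class="co"># by logical vector: columns whose names start with "d"</span>
mtcars[[ <span class="st">"mpg"</span> ]]                    <span class="co"># a single column, extracted as a vector</span>
mtcars$mpg                           <span class="co"># the same column, using $</span></code></pre>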
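<p>The difference between the forms is what they return. As a sketch (the variable names cols and mpg are only for illustration), single brackets keep the result as a data frame, while double brackets and $ extract the column itself:</p>
<pre class="sourceCode r"><code class="sourceCode r">cols = mtcars[ c("mpg", "wt") ]   # single bracket: a smaller list, i.e. a data frame
class(cols)

mpg = mtcars[[ "mpg" ]]           # double bracket: the element itself, a numeric vector
class(mpg)

identical(mpg, mtcars$mpg)        # $ extracts the same vector
class(mtcars[ "mpg" ])            # one name in single brackets is still a data frame</code></pre>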
<p>When we assign a value to a column, e.g.,</p>
<pre class="sourceCode r"><code class="sourceCode r">mtcars$old =<span class="st"> </span><span class="ot">TRUE</span></code></pre>
<p>the recycling rule is used. R ensures that each column (element of the list) of the data frame has the same length, so it repeats TRUE nrow(mtcars) times, once per row.</p>
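<p>To see the recycling, a minimal sketch that works on a copy (named cars here purely for illustration) so that the built-in data set is left untouched:</p>
<pre class="sourceCode r"><code class="sourceCode r">cars = mtcars      # a copy; mtcars itself is not modified
cars$old = TRUE    # a single value is assigned to the new column

length(cars$old) == nrow(cars)   # the column has one element per row
all(cars$old)                    # and every element is TRUE</code></pre>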
<p>So what does</p>
<pre class="sourceCode r"><code class="sourceCode r">mtcars$old =<span class="st"> </span><span class="kw">c</span>(<span class="ot">TRUE</span>, <span class="ot">FALSE</span>)</code></pre>
<p>yield?</p>
<p>And what does</p>
<pre class="sourceCode r"><code class="sourceCode r">mtcars$old =<span class="st"> </span><span class="kw">c</span>(<span class="ot">TRUE</span>, <span class="ot">FALSE</span>, <span class="ot">TRUE</span>)</code></pre>
<p>do?</p>
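<p>One way to explore both questions without changing mtcars is sketched below. try_assign() is a hypothetical helper written for this note, not part of R: it makes the assignment on a copy and returns either a table of the values in the new column or, if R signals an error, the error message.</p>
<pre class="sourceCode r"><code class="sourceCode r"># try_assign() is a hypothetical helper, not an R function
try_assign = function(value) {
  cars = mtcars                   # a local copy
  tryCatch({
    cars$old = value              # attempt the assignment
    table(cars$old)               # success: count the values in the new column
  },
  error = function(e) conditionMessage(e))   # failure: return the message
}

try_assign(c(TRUE, FALSE))
try_assign(c(TRUE, FALSE, TRUE))</code></pre>
<p>Compare the lengths 2 and 3 with nrow(mtcars) when interpreting the two results.</p>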
<p>We can also use 2-dimensional subsetting, indexing rows and columns together. See <a href="Subsetting2D.md">Subsetting2D.md</a>.</p>
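<p>As a brief sketch of the 2-dimensional form, rows are selected before the comma and columns after it. The variable names below are only for illustration.</p>
<pre class="sourceCode r"><code class="sourceCode r">first = mtcars[ 1:3, c("mpg", "wt") ]   # first three rows, two named columns
dim(first)

four = mtcars[ mtcars$cyl == 4, ]       # rows with 4 cylinders, all columns
nrow(four) == sum(mtcars$cyl == 4)</code></pre>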
</body>
</html>