Modern Plain Text Computing
Week 02
November 5, 2024
/ represents a division in the file hierarchy. You can think of it as a branch point on a tree, or as a new level of nesting in a series of boxes, or as the action “Go inside” or “Enter”.
On a Unix-like system, a full path to a file looks like this:
/Users/kjhealy/Documents/courses/mptc/slides/01b-slides.qmd
“Go inside the ‘Users’ folder, then inside the ‘kjhealy’ folder, then inside ‘Documents’ then inside ‘courses’ then ‘mptc’ then ‘slides’ and you will find the file 01b-slides.qmd.”
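As a minimal sketch of the same idea, the standard dirname and basename utilities split a path at its final /. The path is the example from above; the variable names FULLPATH, DIRPART, and FILEPART are illustrative only.

```shell
# Split the example path into its directory part and its file name.
# FULLPATH, DIRPART, and FILEPART are illustrative variable names.
FULLPATH="/Users/kjhealy/Documents/courses/mptc/slides/01b-slides.qmd"

DIRPART=$(dirname "$FULLPATH")    # everything before the final /
FILEPART=$(basename "$FULLPATH")  # the last component only

echo "Directory: $DIRPART"
echo "File:      $FILEPART"

# Each / marks one step down the tree. Print one component per line,
# dropping the empty line produced by the leading /.
STEPS=$(echo "$FULLPATH" | tr '/' '\n' | sed '/^$/d')
echo "$STEPS"
```

The path does not need to exist on your machine for this to work: dirname and basename only look at the text of the path.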
/: root. Everything lives inside or under the root.
/bin/: For binaries. Core user executable programs and tools.
/sbin/: System binaries. Essential executables for the super user (who is also called root).
/lib/: Support files for executables.
/usr/: Conventionally, stuff installed “locally” for users in addition to the core system. Will contain its own bin/ and lib/ subdirs.
/usr/local: Files that the local user has compiled or installed.
/opt/: Like /usr/, another place for locally installed software to go.
The shell finds commands using $PATH, which is an environment variable that tells the system where executables can be found.
The directories in $PATH are separated by : and searched in order from left to right.
The which command reports where a given executable was found.
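A small sketch of how you might inspect the search path yourself. The variable name LS_LOCATION is illustrative, and the directories printed will differ from one machine to another.

```shell
# Show each directory in $PATH on its own line, in the order it is searched.
echo "$PATH" | tr ':' '\n'

# Ask the shell where it finds the ls executable.
# command -v is the POSIX way to ask; which gives a similar answer on most systems.
LS_LOCATION=$(command -v ls)
echo "ls is found at: $LS_LOCATION"
```

If two directories in $PATH each contain a program with the same name, the one in the directory listed first is the one that runs.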
/etc/: Editable text configuration. Config files often go here.
/home/ or /Users/: Where the accounts of individual system users live, like /Users/kjhealy or /home/kjhealy.
On Linux, user accounts usually live in /home. On macOS they live in /Users. Windows is different again (and uses \ for file paths rather than /).
The / tree:
├── Applications
├── bin
├── cores
├── dev
├── etc -> private/etc
├── home -> /System/Volumes/Data/home
├── Library
├── opt
│ ├── homebrew
├── private
│ ├── etc
│ ├── tftpboot
│ ├── tmp
│ └── var
├── sbin
├── System
├── tmp -> private/tmp
├── Users
│ ├── kjhealy
│ └── Shared
├── usr
│ ├── bin
│ ├── lib
│ ├── libexec
│ ├── local
│ ├── sbin
│ ├── share
│ ├── standalone
├── var -> private/var
└── Volumes
The /Users/kjhealy tree:
├── Applications
├── bin
├── Box
├── Creative Cloud Files
├── Desktop
├── Documents
│ ├── bibs -> /Users/kjhealy/Library/texmf/bibtex/bib
│ ├── bookdown
│ ├── comments
│ ├── completed
│ ├── courses
│ ├── data
│ ├── letters
│ ├── misc
│ ├── nonsense
│ ├── ordinal-society
│ ├── papers
│ ├── sites
│ ├── source
│ ├── talks
│ ├── teaching
│ ├── templates
│ ├── vita
├── Downloads
├── Dropbox
├── Library
├── Movies
├── Music
├── Pictures
├── Public
├── scratch
├── tmp
└── Zotero
So, how do we make our way around this file hierarchy tree and how do we take actions and do things?
Common shells include sh (the Bourne shell), bash (the Bourne-Again Shell), and zsh (the Z shell).
A shell is an interpreter. It waits for commands. When you supply them, it does what you tell it, or tells the relevant bit of the operating system to do what you said.
This mode of interacting with a computer is sometimes called a REPL or Read-Eval-Print Loop.
Programming languages like Python and R work this way as well. So does ChatGPT. Shell commands (and R and Python commands, and scripts) are interpreted, meaning the code is sent to an interpreter (the shell, or the Python or R program) that runs it.
This is distinct from languages (at least originally) designed to be compiled into executable machine code before they are run. Languages like C, Go, and Rust are in this category.
Who am I?
Where am I?
What is in here?
R
README.md
README.qmd
README_files
_extensions
_freeze
_quarto.yml
_site
_targets
_targets.R
_variables.yml
about
assets
assignment
avhrr
content
data
deploy.sh
example
files
html
index.html
index.qmd
mptc.Rproj
renv
renv.lock
renv.lock.orig
schedule
seas
site_libs
slides
staging
syllabus
Who am I?
Where am I?
What is my purpose in life?
total 408
drwxr-xr-x@ 8 kjhealy staff 256 Aug 15 2023 R
-rw-r--r--@ 1 kjhealy staff 1967 Sep 18 2023 README.md
-rw-r--r-- 1 kjhealy staff 1764 Jan 23 2024 README.qmd
drwxr-xr-x@ 3 kjhealy staff 96 Sep 18 2023 README_files
drwxr-xr-x 3 kjhealy staff 96 Jan 9 2024 _extensions
drwxr-xr-x@ 10 kjhealy staff 320 Nov 5 10:57 _freeze
-rw-r--r--@ 1 kjhealy staff 4750 Nov 5 10:34 _quarto.yml
drwxr-xr-x@ 2 kjhealy staff 64 Nov 5 10:56 _site
drwxr-xr-x@ 7 kjhealy staff 224 Nov 5 10:54 _targets
-rw-r--r--@ 1 kjhealy staff 2737 Sep 27 17:40 _targets.R
-rw-r--r--@ 1 kjhealy staff 997 Aug 26 14:40 _variables.yml
drwxr-xr-x@ 4 kjhealy staff 128 Nov 5 10:57 about
drwxr-xr-x@ 16 kjhealy staff 512 Nov 5 10:12 assets
drwxr-xr-x@ 22 kjhealy staff 704 Nov 5 10:57 assignment
lrwxr-xr-x 1 kjhealy staff 135 Nov 5 10:23 avhrr -> /Users/kjhealy/Documents/data/misc/noaa_ncei/raw/www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr
drwxr-xr-x@ 24 kjhealy staff 768 Nov 5 10:57 content
drwxr-xr-x@ 5 kjhealy staff 160 Nov 5 10:14 data
-rwxr-xr-x@ 1 kjhealy staff 437 Oct 23 08:19 deploy.sh
drwxr-xr-x@ 24 kjhealy staff 768 Nov 5 10:57 example
drwxr-xr-x@ 12 kjhealy staff 384 Oct 23 09:18 files
drwxr-xr-x 14 kjhealy staff 448 Jan 8 2024 html
-rw-r--r--@ 1 kjhealy staff 50673 Nov 5 10:57 index.html
-rw-r--r--@ 1 kjhealy staff 6937 Oct 23 10:59 index.qmd
-rw-r--r--@ 1 kjhealy staff 258 Oct 29 17:52 mptc.Rproj
drwxr-xr-x@ 7 kjhealy staff 224 Aug 15 2023 renv
-rw-r--r--@ 1 kjhealy staff 63998 Nov 4 08:04 renv.lock
-rw-r--r-- 1 kjhealy staff 46717 Dec 11 2023 renv.lock.orig
drwxr-xr-x@ 4 kjhealy staff 128 Nov 5 10:57 schedule
lrwxr-xr-x 1 kjhealy staff 66 Nov 5 10:23 seas -> /Users/kjhealy/Documents/data/misc/noaa_ncei/raw/World_Seas_IHO_v3
drwxr-xr-x@ 11 kjhealy staff 352 Nov 5 10:57 site_libs
drwxr-xr-x@ 21 kjhealy staff 672 Nov 5 10:57 slides
drwxr-xr-x 6 kjhealy staff 192 Aug 28 09:24 staging
drwxr-xr-x 3 kjhealy staff 96 Nov 5 10:54 syllabus
Note the idea of commands having options, or switches.
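To make the idea of options concrete, here is a hedged sketch that runs ls with and without a few common switches. It works in a throwaway directory; DEMO_DIR and the file names are made up for the illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/notes.txt" "$DEMO_DIR/.hidden"

# No options: names only.
PLAIN=$(ls "$DEMO_DIR")
echo "$PLAIN"

# -a: also list files whose names start with a dot.
ALL=$(ls -a "$DEMO_DIR")
echo "$ALL"

# -l: long format, with permissions, owner, group, size, and date.
ls -l "$DEMO_DIR"

# Single-letter options can usually be combined after one dash.
ls -la "$DEMO_DIR"

rm -r "$DEMO_DIR"
```

The command stays the same each time; the options change what it reports.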
If a path name begins with /, it is an absolute path, starting from the filesystem root.
If a path name begins with ~, it will usually be expanded into an absolute path name starting at your home directory (~).
If the path name does not begin with a / or ~ then the path name is relative to the current directory.
Two relative special cases use entries that are in every Unix directory:
If the path name begins with ./, the path is relative to the current directory, e.g., ./textfile, though this can also execute the file if it is given executable file permissions.
If the path name begins with ../, the path is relative to the parent of the current directory. For example, if your current directory is /Users/kjhealy/Documents/papers then ../data means /Users/kjhealy/Documents/data.
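The rules above can be tried out safely in a temporary directory. This is only a sketch: BASE and the papers and data folders are stand-ins created for the example, not the real folders named in the text.

```shell
# Build a small directory tree in a temporary location.
# BASE, papers, and data are illustrative names.
BASE=$(mktemp -d)
mkdir -p "$BASE/Documents/papers" "$BASE/Documents/data"

cd "$BASE/Documents/papers"   # an absolute path, spelled out from the top
cd ../data                    # relative: up to the parent, then into data
RESULT=$(pwd)
echo "$RESULT"

cd ./                         # ./ is the current directory, so nothing changes
STILL=$(pwd)
echo "$STILL"

echo "$HOME"                  # what a leading ~ normally expands to

# Tidy up.
cd /
rm -r "$BASE"
```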
Who is using this file system anyway?
drwxr-xr-x@ 8 kjhealy staff 256 Aug 15 16:35 R
-rw-r--r--@ 1 kjhealy staff 1210 Aug 15 20:29 README.md
Unix derives from a world where there are multiple users and groups of users who are all using slices (in terms of processor time and available permanent storage) of a large central computer.
In Unix systems there are three kinds of owner: the user (here kjhealy), the group (here staff), and others, meaning all other users on the system.
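Three things you can do to a file: read it (r), write to it (w), or execute it (x).
For a directory, execute permission means you can cd into it.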
❯ ls -l README.md
-rw-r--r--@ 1 kjhealy staff 1210 Aug 15 20:29 README.md
These permissions say rw-r--r--, or:
rw-: the user can read and write this file
r--: the group can read this file
r--: others can read this file
Executable permissions are irrelevant here because it’s a text file.
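Permission triplets are often written as three octal digits, where read counts 4, write counts 2, and execute counts 1. The sketch below sets permissions on a scratch file with chmod and reads them back; SCRATCH, PERMS, and PERMS_EXEC are illustrative names.

```shell
# Create a scratch file and set its permissions explicitly.
# Each octal digit is a sum: read = 4, write = 2, execute = 1.
# 6 = 4 + 2 = rw-    4 = r--    4 = r--
SCRATCH=$(mktemp)
chmod 644 "$SCRATCH"

# The first ten characters of ls -l are the file type plus the three triplets.
PERMS=$(ls -l "$SCRATCH" | cut -c1-10)
echo "$PERMS"

# Give the user execute permission as well: 7 = 4 + 2 + 1 = rwx.
chmod 744 "$SCRATCH"
PERMS_EXEC=$(ls -l "$SCRATCH" | cut -c1-10)
echo "$PERMS_EXEC"

rm "$SCRATCH"
```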
Permissions are changed with the chmod command. So e.g. chmod 644 README.md means “change the permissions to rw-r--r--”.
├── schedule
├── staging
│ ├── example
│ ├── content
│ ├── assignment
│ ├── slides
├── README_files
│ ├── libs
├── example
│ ├── 09-example_files
│ ├── 07-example_files
├── R
├── content
├── assignment
├── html
│ ├── fonts
├── site_libs
│ ├── revealjs
│ ├── bootstrap
│ ├── quarto-html
│ ├── quarto-contrib
│ ├── quarto-nav
│ ├── quarto-search
│ ├── lightable-0.0.1
│ ├── kePrint-0.0.1
│ ├── clipboard
├── about
├── slides
│ ├── 00-slides_files
│ ├── 02-slides_files
├── syllabus
├── _extensions
│ ├── kjhealy
├── _site
├── files
│ ├── misc
│ ├── examples
│ ├── scripts
│ ├── bib
├── .git
│ ├── objects
│ ├── info
│ ├── logs
│ ├── hooks
│ ├── refs
│ ├── modules
├── _targets
│ ├── meta
│ ├── objects
│ ├── user
│ ├── workspaces
├── renv
│ ├── staging
│ ├── library
├── data
├── assets
│ ├── 03-editors
│ ├── 04-r
│ ├── 10-parallel
│ ├── 04-git
│ ├── 08-iterate
│ ├── 00-site
│ ├── 02-shell
│ ├── 01-file-system
│ ├── 07-ingest
│ ├── 05-dplyr
│ ├── 06-build
├── _freeze
│ ├── schedule
│ ├── example
│ ├── content
│ ├── assignment
│ ├── site_libs
│ ├── slides
│ ├── syllabus
│ ├── index
├── .Rproj.user
│ ├── B6516D0D
│ ├── shared
├── .quarto
│ ├── xref
│ ├── idx
│ ├── preview
│ ├── _freeze
Project at: https://github.com/kjhealy/mptc_text_examples
Download the zip file, for now
01_mptc_oecd_nocode.pdf
01_mptc_oecd_withcode.pdf
SAS_on_2021-04-13.csv
_make-example
alice_in_wonderland.txt
alice_noboiler.txt
apple_mobility_daily_2021-04-12.csv
ascii_table.xlsx
bashrc.txt
basics.txt
congress
continent_sizes.csv
continent_tab.csv
continent_tab.tsv
countries.csv
countries_iso3.csv
country-intermediate.tsv
country-working.tsv
country_iso3.tsv
country_tab.csv
country_tab.tsv
fars0-17daily.csv
fars_crash_report.xlsx
first_terms.csv
fruit.txt
gapminder_xtra.csv
gss_panel_long.dta
jabberwocky.txt
mortality.txt
organdonation.csv
pride_and_prejudice.txt
rfm_table.csv
roman.txt
sentences.txt
shalott_1832.txt
shalott_1842.txt
specials.txt
symptoms.xlsx
ulysses.txt
words.txt
year_tab.tsv
zshrc.txt
pwd, cd, and ls.
wc, cat, head, and tail
We can ask for a count of lines only:
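A minimal sketch of counting lines with wc, using a small made-up file so that it runs anywhere. SAMPLE and LINES_ONLY are illustrative names.

```shell
# Make a small three-line file to count. SAMPLE is an illustrative name.
SAMPLE=$(mktemp)
printf 'one fish\ntwo fish\nred fish\n' > "$SAMPLE"

# With no options wc reports lines, words, and bytes, then the file name.
wc "$SAMPLE"

# -l asks for the line count only. Reading from standard input (<)
# keeps the file name out of the output; tr strips any padding spaces.
LINES_ONLY=$(wc -l < "$SAMPLE" | tr -d ' ')
echo "$LINES_ONLY"

rm "$SAMPLE"
```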
cat concatenates and prints the files given to it.
’Twas brillig, and the slithy toves
Did gyre and gimble in the wabe:
All mimsy were the borogoves,
And the mome raths outgrabe.
“Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!”
He took his vorpal sword in hand;
Long time the manxome foe he sought—
So rested he by the Tumtum tree
And stood awhile in thought.
And, as in uffish thought he stood,
The Jabberwock, with eyes of flame,
Came whiffling through the tulgey wood,
And burbled as it came!
One, two! One, two! And through and through
The vorpal blade went snicker-snack!
He left it dead, and with its head
He went galumphing back.
“And hast thou slain the Jabberwock?
Come to my arms, my beamish boy!
O frabjous day! Callooh! Callay!”
He chortled in his joy.
’Twas brillig, and the slithy toves
Did gyre and gimble in the wabe:
All mimsy were the borogoves,
And the mome raths outgrabe.
The top:
The Project Gutenberg eBook of Alice's Adventures in Wonderland
This ebook is for the use of anyone anywhere in the United States and
most other parts of the world at no cost and with almost no restrictions
whatsoever. You may copy it, give it away or re-use it under the terms
of the Project Gutenberg License included with this ebook or online
at www.gutenberg.org. If you are not located in the United States,
you will have to check the laws of the country where you are located
before using this eBook.
The bottom:
Most people start at our website which has the main PG search
facility: www.gutenberg.org.
This website includes information about Project Gutenberg™,
including how to make donations to the Project Gutenberg Literary
Archive Foundation, how to help produce our new eBooks, and how to
subscribe to our email newsletter to hear about new eBooks.
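Output like the top and bottom shown above comes from head and tail. Here is a hedged sketch using ten numbered lines in place of the book; SAMPLE, TOP, and BOTTOM are illustrative names.

```shell
# Ten numbered lines stand in for a longer text file.
SAMPLE=$(mktemp)
seq 1 10 > "$SAMPLE"

# head prints the first lines of a file; -n sets how many.
TOP=$(head -n 3 "$SAMPLE")
echo "$TOP"

# tail prints the last lines of a file.
BOTTOM=$(tail -n 2 "$SAMPLE")
echo "$BOTTOM"

rm "$SAMPLE"
```

Without -n, both commands show ten lines.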
There are 29 lines of boilerplate at the start of the book:
The Project Gutenberg eBook of Alice's Adventures in Wonderland
This ebook is for the use of anyone anywhere in the United States and
most other parts of the world at no cost and with almost no restrictions
whatsoever. You may copy it, give it away or re-use it under the terms
of the Project Gutenberg License included with this ebook or online
at www.gutenberg.org. If you are not located in the United States,
you will have to check the laws of the country where you are located
before using this eBook.
Title: Alice's Adventures in Wonderland
Author: Lewis Carroll
Release date: June 27, 2008 [eBook #11]
Most recently updated: March 30, 2021
Language: English
Credits: Arthur DiBianca and David Widger
*** START OF THE PROJECT GUTENBERG EBOOK ALICE'S ADVENTURES IN WONDERLAND ***
[Illustration]
And 351 at the end:
*** END OF THE PROJECT GUTENBERG EBOOK ALICE'S ADVENTURES IN WONDERLAND ***
Updated editions will replace the previous one—the old editions will
be renamed.
Creating the works from print editions not protected by U.S. copyright
law means that no one owns a United States copyright in these works,
so the Foundation (and you!) can copy and distribute it in the United
States without permission and without paying copyright
royalties. Special rules, set forth in the General Terms of Use part
of this license, apply to copying and distributing Project
Gutenberg™ electronic works to protect the PROJECT GUTENBERG™
concept and trademark. Project Gutenberg is a registered trademark,
and may not be used if you charge for an eBook, except by following
the terms of the trademark license, including paying royalties for use
of the Project Gutenberg trademark. If you do not charge anything for
copies of this eBook, complying with the trademark license is very
We can use tail to skip the boilerplate at the top:
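One way to do this is with the + form of tail -n, sketched here on a tiny made-up file rather than the book itself. Note that tail -n +K starts printing at line K, so skipping N lines means asking for line N + 1. SAMPLE, SKIP, and BODY are illustrative names.

```shell
# Two header lines followed by two lines we want to keep.
SAMPLE=$(mktemp)
printf 'header 1\nheader 2\nbody 1\nbody 2\n' > "$SAMPLE"

# tail -n +K starts printing AT line K.
# To skip 2 header lines we therefore start at line 3.
SKIP=2
BODY=$(tail -n +$((SKIP + 1)) "$SAMPLE")
echo "$BODY"

rm "$SAMPLE"
```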
The shell can be treated like a programming language. That is, it has variables and also flow control (loops, if-then-else, etc).
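As a brief illustration of those pieces, here is a sketch with one variable, one loop, and one if-then-else. All of the names and values are invented for the example.

```shell
# A variable is set with NAME=value (no spaces around =) and read with $NAME.
GREETING="hello"

# A for loop runs its body once per item in the list.
COUNT=0
for word in alpha beta gamma; do
    echo "$GREETING, $word"
    COUNT=$((COUNT + 1))   # $(( )) does integer arithmetic
done

# if-then-else branches on a test; -gt means "greater than".
if [ "$COUNT" -gt 2 ]; then
    VERDICT="more than two words"
else
    VERDICT="two words or fewer"
fi
echo "$VERDICT"
```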
We can use some shell variables along with tail and head to skip the boilerplate at the top and bottom, and put the result into a file of its own using > to redirect the output from STDOUT:
# This sets HEADSKIP to 29 and ENDSKIP to 351;
# We can refer to them with $HEADSKIP and $ENDSKIP
HEADSKIP=29
ENDSKIP=351
# The backticks ` ` here mean "Evaluate this command"; then put the result in a variable.
# wc -l pads its count with spaces on some systems; tr and tail -1 keep just the number.
BOOKLINES=`cat files/examples/alice_in_wonderland.txt | wc -l | tr ' ' '\n' | tail -1`
# This line does the arithmetic using expr and makes the result a variable
GOODLINES=$(expr $BOOKLINES - $HEADSKIP - $ENDSKIP)
# Now we use $HEADSKIP and $GOODLINES and create a new file.
# tail -n +K starts printing at line K, so we start one line past the boilerplate.
tail -n +$(expr $HEADSKIP + 1) files/examples/alice_in_wonderland.txt |
head -n $GOODLINES > files/examples/alice_noboiler.txt
Now our wc will be different:
uniq, sort, and cut
A data file:
cname,iso3,iso2,continent
Afghanistan,AFG,AF,Asia
Algeria,DZA,DZ,Africa
Armenia,ARM,AM,Asia
Australia,AUS,AU,Oceania
Austria,AUT,AT,Europe
Azerbaijan,AZE,AZ,Asia
Bahrain,BHR,BH,Asia
Belarus,BLR,BY,Europe
Belgium,BEL,BE,Europe
How many lines?
How many unique lines?
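A sketch of how these two questions can be answered, using a five-line stand-in file rather than the country data. One point worth knowing is that uniq only collapses duplicate lines that sit next to each other, so it is usually paired with sort. SAMPLE, TOTAL, ADJACENT, and DISTINCT are illustrative names.

```shell
# Five lines, two distinct values, duplicates not all adjacent.
SAMPLE=$(mktemp)
printf 'Asia\nEurope\nAsia\nAsia\nEurope\n' > "$SAMPLE"

# How many lines?
TOTAL=$(wc -l < "$SAMPLE" | tr -d ' ')

# uniq alone only merges repeated lines that are next to each other.
ADJACENT=$(uniq "$SAMPLE" | wc -l | tr -d ' ')

# Sorting first brings all duplicates together, so uniq can merge them.
DISTINCT=$(sort "$SAMPLE" | uniq | wc -l | tr -d ' ')

echo "total: $TOTAL  after uniq: $ADJACENT  after sort | uniq: $DISTINCT"
rm "$SAMPLE"
```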
Zimbabwe,ZWE,ZW,Africa
Zambia,ZMB,ZM,Africa
Yemen,YEM,YE,Asia
Western Sahara,ESH,EH,Africa
Wallis and Futuna,WLF,WF,Oceania
Viet Nam,VNM,VN,Asia
Vanuatu,VUT,VU,Oceania
Uzbekistan,UZB,UZ,Asia
Uruguay,URY,UY,South America
United States,USA,US,North America
This doesn’t quite work because of the way the data is coded:
Algeria,DZA,DZ,Africa
Angola,AGO,AO,Africa
Benin,BEN,BJ,Africa
Botswana,BWA,BW,Africa
Burkina Faso,BFA,BF,Africa
Burundi,BDI,BI,Africa
Cabo Verde,CPV,CV,Africa
Cameroon,CMR,CM,Africa
Central African Republic,CAF,CF,Africa
Chad,TCD,TD,Africa
Comoros,COM,KM,Africa
Congo,COG,CG,Africa
Côte d'Ivoire,CIV,CI,Africa
Djibouti,DJI,DJ,Africa
Egypt,EGY,EG,Africa
Equatorial Guinea,GNQ,GQ,Africa
Eritrea,ERI,ER,Africa
Ethiopia,ETH,ET,Africa
Gabon,GAB,GA,Africa
Gambia,GMB,GM,Africa
Ghana,GHA,GH,Africa
Guinea,GIN,GN,Africa
Guinea-Bissau,GNB,GW,Africa
Kenya,KEN,KE,Africa
Lesotho,LSO,LS,Africa
Liberia,LBR,LR,Africa
Libya,LBY,LY,Africa
Madagascar,MDG,MG,Africa
Malawi,MWI,MW,Africa
Mali,MLI,ML,Africa
Mauritania,MRT,MR,Africa
Mauritius,MUS,MU,Africa
Morocco,MAR,MA,Africa
Mozambique,MOZ,MZ,Africa
Namibia,"NAM",NA,Africa
Niger,NER,NE,Africa
Nigeria,NGA,NG,Africa
Rwanda,RWA,RW,Africa
Sao Tome and Principe,STP,ST,Africa
Senegal,SEN,SN,Africa
Seychelles,SYC,SC,Africa
Sierra Leone,SLE,SL,Africa
Somalia,SOM,SO,Africa
South Africa,ZAF,ZA,Africa
South Sudan,SSD,SS,Africa
Sudan,SDN,SD,Africa
Swaziland,SWZ,SZ,Africa
Togo,TGO,TG,Africa
Tunisia,TUN,TN,Africa
Uganda,UGA,UG,Africa
Western Sahara,ESH,EH,Africa
Zambia,ZMB,ZM,Africa
Zimbabwe,ZWE,ZW,Africa
Afghanistan,AFG,AF,Asia
Armenia,ARM,AM,Asia
Azerbaijan,AZE,AZ,Asia
Bahrain,BHR,BH,Asia
Bangladesh,BGD,BD,Asia
Bhutan,BTN,BT,Asia
Brunei Darussalam,BRN,BN,Asia
Cambodia,KHM,KH,Asia
China,CHN,CN,Asia
Georgia,GEO,GE,Asia
India,IND,IN,Asia
Indonesia,IDN,ID,Asia
Iraq,IRQ,IQ,Asia
Israel,ISR,IL,Asia
Japan,JPN,JP,Asia
Jordan,JOR,JO,Asia
Kazakhstan,KAZ,KZ,Asia
Kuwait,KWT,KW,Asia
Kyrgyzstan,KGZ,KG,Asia
Lao People's Democratic Republic,LAO,LA,Asia
Lebanon,LBN,LB,Asia
Malaysia,MYS,MY,Asia
Maldives,MDV,MV,Asia
Mongolia,MNG,MN,Asia
Myanmar,MMR,MM,Asia
Nepal,NPL,NP,Asia
Oman,OMN,OM,Asia
Pakistan,PAK,PK,Asia
Philippines,PHL,PH,Asia
Qatar,QAT,QA,Asia
Saudi Arabia,SAU,SA,Asia
Singapore,SGP,SG,Asia
Sri Lanka,LKA,LK,Asia
Syrian Arab Republic,SYR,SY,Asia
Tajikistan,TJK,TJ,Asia
Thailand,THA,TH,Asia
Turkey,TUR,TR,Asia
United Arab Emirates,ARE,AE,Asia
Uzbekistan,UZB,UZ,Asia
Viet Nam,VNM,VN,Asia
Yemen,YEM,YE,Asia
"Bolivia, Plurinational State of",BOL,BO,South America
"Bonaire, Sint Eustatius and Saba",BES,BQ,North America
"Congo, the Democratic Republic of the",COD,CD,Africa
Albania,ALB,AL,Europe
Andorra,AND,AD,Europe
Austria,AUT,AT,Europe
Belarus,BLR,BY,Europe
Belgium,BEL,BE,Europe
Bosnia and Herzegovina,BIH,BA,Europe
Bulgaria,BGR,BG,Europe
Croatia,HRV,HR,Europe
Cyprus,CYP,CY,Europe
Czech Republic,CZE,CZ,Europe
Denmark,DNK,DK,Europe
Estonia,EST,EE,Europe
Faroe Islands,FRO,FO,Europe
Finland,FIN,FI,Europe
France,FRA,FR,Europe
Germany,DEU,DE,Europe
Gibraltar,GIB,GI,Europe
Greece,GRC,GR,Europe
Guernsey,GGY,GG,Europe
Holy See (Vatican City State),VAT,VA,Europe
Hungary,HUN,HU,Europe
Iceland,ISL,IS,Europe
Ireland,IRL,IE,Europe
Isle of Man,IMN,IM,Europe
Italy,ITA,IT,Europe
Jersey,JEY,JE,Europe
Kosovo,XKV,NA,Europe
Latvia,LVA,LV,Europe
Liechtenstein,LIE,LI,Europe
Lithuania,LTU,LT,Europe
Luxembourg,LUX,LU,Europe
Malta,MLT,MT,Europe
Monaco,MCO,MC,Europe
Montenegro,MNE,ME,Europe
Netherlands,NLD,NL,Europe
Norway,NOR,NO,Europe
Poland,POL,PL,Europe
Portugal,PRT,PT,Europe
Romania,ROU,RO,Europe
Russian Federation,RUS,RU,Europe
San Marino,SMR,SM,Europe
Serbia,SRB,RS,Europe
Slovakia,SVK,SK,Europe
Slovenia,SVN,SI,Europe
Spain,ESP,ES,Europe
Sweden,SWE,SE,Europe
Switzerland,CHE,CH,Europe
Ukraine,UKR,UA,Europe
United Kingdom,GBR,GB,Europe
"Iran, Islamic Republic of",IRN,IR,Asia
"Korea, Republic of",KOR,KR,Asia
"Moldova, Republic of",MDA,MD,Europe
"Macedonia, the former Yugoslav Republic of",MKD,MK,Europe
Anguilla,AIA,AI,North America
Antigua and Barbuda,ATG,AG,North America
Aruba,ABW,AW,North America
Bahamas,BHS,BS,North America
Barbados,BRB,BB,North America
Belize,BLZ,BZ,North America
Bermuda,BMU,BM,North America
Canada,CAN,CA,North America
Cayman Islands,CYM,KY,North America
Costa Rica,CRI,CR,North America
Cuba,CUB,CU,North America
Curaçao,CUW,CW,North America
Dominica,DMA,DM,North America
Dominican Republic,DOM,DO,North America
El Salvador,SLV,SV,North America
Greenland,GRL,GL,North America
Grenada,GRD,GD,North America
Guatemala,GTM,GT,North America
Haiti,HTI,HT,North America
Honduras,HND,HN,North America
Jamaica,JAM,JM,North America
Mexico,MEX,MX,North America
Montserrat,MSR,MS,North America
Nicaragua,NIC,NI,North America
Panama,PAN,PA,North America
Puerto Rico,PRI,PR,North America
Saint Kitts and Nevis,KNA,KN,North America
Saint Lucia,LCA,LC,North America
Saint Vincent and the Grenadines,VCT,VC,North America
Sint Maarten (Dutch part),SXM,SX,North America
Trinidad and Tobago,TTO,TT,North America
Turks and Caicos Islands,TCA,TC,North America
United States,USA,US,North America
Australia,AUS,AU,Oceania
Fiji,FJI,FJ,Oceania
French Polynesia,PYF,PF,Oceania
Guam,GUM,GU,Oceania
Marshall Islands,MHL,MH,Oceania
New Caledonia,NCL,NC,Oceania
New Zealand,NZL,NZ,Oceania
Northern Mariana Islands,MNP,MP,Oceania
Papua New Guinea,PNG,PG,Oceania
Solomon Islands,SLB,SB,Oceania
Timor-Leste,TLS,TL,Oceania
Vanuatu,VUT,VU,Oceania
Wallis and Futuna,WLF,WF,Oceania
"Palestine, State of",PSE,PS,Asia
Argentina,ARG,AR,South America
Brazil,BRA,BR,South America
Chile,CHL,CL,South America
Colombia,COL,CO,South America
Ecuador,ECU,EC,South America
Falkland Islands (Malvinas),FLK,FK,South America
Guyana,GUY,GY,South America
Paraguay,PRY,PY,South America
Peru,PER,PE,South America
Suriname,SUR,SR,South America
Uruguay,URY,UY,South America
"Taiwan, Province of China",TWN,TW,Asia
"Tanzania, United Republic of",TZA,TZ,Africa
"Venezuela, Bolivarian Republic of",VEN,VE,South America
"Virgin Islands, British",VGB,VG,North America
"Virgin Islands, U.S.",VIR,VI,North America
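The sorting problem with the country data can be reproduced on a few rows. This sketch assumes the sort was done on the fourth comma-separated field, which the text does not state; SAMPLE and SORTED are illustrative names. Because sort knows nothing about CSV quoting, a quoted name containing a comma is split in two and every later field shifts along by one.

```shell
# A few rows in the same layout as the country file.
SAMPLE=$(mktemp)
cat > "$SAMPLE" <<'EOF'
Uruguay,URY,UY,South America
"Korea, Republic of",KOR,KR,Asia
Algeria,DZA,DZ,Africa
France,FRA,FR,Europe
Japan,JPN,JP,Asia
EOF

# -t, sets the field separator to a comma; -k4 sorts on field 4 onward.
# For the Korea row the comma inside the quotes counts as a separator,
# so its "fourth field" is KR rather than Asia.
# LC_ALL=C makes the ordering plain byte order, the same on every system.
SORTED=$(LC_ALL=C sort -t, -k4 "$SAMPLE")
echo "$SORTED"

rm "$SAMPLE"
```

The Korea row ends up after Europe instead of next to Japan, which is the same kind of misplacement seen in the full listing.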
cut slices out columns defined by a delimiter (by default \t or tab)
iso3,continent
AFG,Asia
DZA,Africa
ARM,Asia
AUS,Oceania
AUT,Europe
AZE,Asia
BHR,Asia
BLR,Europe
BEL,Europe
BRA,South America
KHM,Asia
CAN,North America
CHN,Asia
HRV,Europe
CZE,Europe
DNK,Europe
DOM,North America
ECU,South America
EGY,Africa
EST,Europe
FIN,Europe
FRA,Europe
GEO,Asia
DEU,Europe
GRC,Europe
ISL,Europe
IND,Asia
IDN,Asia
Islamic Republic of",IR
IRQ,Asia
IRL,Europe
ISR,Asia
ITA,Europe
JPN,Asia
KWT,Asia
LBN,Asia
LTU,Europe
LUX,Europe
MYS,Asia
MEX,North America
MCO,Europe
NPL,Asia
NLD,Europe
NZL,Oceania
NGA,Africa
the former Yugoslav Republic of",MK
NOR,Europe
OMN,Asia
PAK,Asia
PHL,Asia
QAT,Asia
ROU,Europe
RUS,Europe
SMR,Europe
SGP,Asia
Republic of",KR
ESP,Europe
LKA,Asia
SWE,Europe
CHE,Europe
Province of China",TW
THA,Asia
ARE,Asia
GBR,Europe
USA,North America
VNM,Asia
AND,Europe
JOR,Asia
LVA,Europe
MAR,Africa
PRT,Europe
SAU,Asia
SEN,Africa
SXM,North America
TUN,Africa
ARG,South America
CHL,South America
POL,Europe
UKR,Europe
HUN,Europe
LIE,Europe
SVN,Europe
BTN,Asia
BIH,Europe
FRO,Europe
State of",PS
ZAF,Africa
CMR,Africa
COL,South America
CRI,North America
VAT,Europe
MLT,Europe
PER,South America
SRB,Europe
SVK,Europe
TGO,Africa
BGR,Europe
MDV,Asia
Republic of",MD
PRY,South America
ALB,Europe
BGD,Asia
BRN,Asia
CYP,Europe
MNG,Asia
PAN,North America
BFA,Africa
the Democratic Republic of the",CD
Plurinational State of",BO
CIV,Africa
CUB,North America
HND,North America
JAM,North America
TUR,Asia
ABW,North America
CUW,North America
GAB,Africa
GHA,Africa
GUY,South America
VCT,North America
TTO,North America
ETH,Africa
GIN,Africa
KEN,Africa
XKV,Europe
SDN,Africa
ATG,North America
GNQ,Africa
SWZ,Africa
GTM,North America
KAZ,Asia
MRT,Africa
"NAM",Africa
RWA,Africa
LCA,North America
SYC,Africa
SUR,South America
URY,South America
Bolivarian Republic of",VE
BHS,North America
CAF,Africa
COG,Africa
UZB,Asia
BEN,Africa
LBR,Africa
MMR,Asia
SOM,Africa
United Republic of",TZ
BRB,North America
GMB,Africa
MNE,Europe
DJI,Africa
SLV,North America
PYF,Oceania
GUM,Oceania
KGZ,Asia
NIC,North America
ZMB,Africa
BMU,North America
CYM,North America
TCD,Africa
FJI,Oceania
GIB,Europe
GRL,North America
GGY,Europe
HTI,North America
JEY,Europe
MUS,Africa
CPV,Africa
IMN,Europe
MDG,Africa
MSR,North America
NCL,Oceania
NER,Africa
PNG,Oceania
ZWE,Africa
AGO,Africa
ERI,Africa
TLS,Oceania
UGA,Africa
DMA,North America
GRD,North America
MOZ,Africa
SYR,Asia
BLZ,North America
U.S.",VI
LAO,Asia
LBY,Africa
TCA,North America
MLI,Africa
KNA,North America
AIA,North America
British",VG
GNB,Africa
PRI,North America
MNP,Oceania
BWA,Africa
BDI,Africa
SLE,Africa
Sint Eustatius and Saba",BQ
MWI,Africa
FLK,South America
SSD,Africa
STP,Africa
YEM,Asia
ESH,Africa
TJK,Asia
COM,Africa
LSO,Africa
SLB,Oceania
WLF,Oceania
MHL,Oceania
VUT,Oceania
Again in this case it doesn’t quite behave as you might think!
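The surprise can be seen on three rows. This sketch assumes the columns were selected with -d ',' and -f 2,4, which matches the output shown but is not stated in the text; SAMPLE and PICKED are illustrative names.

```shell
# A header and two rows in the same layout as the country file.
SAMPLE=$(mktemp)
cat > "$SAMPLE" <<'EOF'
cname,iso3,iso2,continent
Japan,JPN,JP,Asia
"Korea, Republic of",KOR,KR,Asia
EOF

# -d sets the delimiter and -f picks fields by position.
# cut splits on every comma, including the one inside the quoted name,
# so fields 2 and 4 of the Korea row are not the ones we meant.
PICKED=$(cut -d ',' -f 2,4 "$SAMPLE")
echo "$PICKED"

rm "$SAMPLE"
```

For data with quoted fields like this, a tool that understands CSV quoting is the safer choice.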
find
find is for locating files and directories by name:
files
files/misc
files/misc/home-tree.txt
files/misc/root-tree.txt
files/.DS_Store
files/schedule.ics
files/01_apple_macintosh.png
files/01_bryant_hard_drive.png
files/fars_spreadsheet_raw.png
files/examples
files/examples/country_iso3.tsv
files/examples/jabberwocky.txt
files/examples/country_tab.csv
files/examples/ulysses.txt
files/examples/_make-example
files/examples/_make-example/mypaper.md
files/examples/_make-example/fig1.r
files/examples/_make-example/Makefile
files/examples/_make-example/README.md
files/examples/_make-example/.gitignore
files/examples/_make-example/.RData
files/examples/rfm_table.csv
files/examples/01_mptc_oecd_nocode.pdf
files/examples/.DS_Store
files/examples/countries.csv
files/examples/specials.txt
files/examples/gapminder_xtra.csv
files/examples/bashrc.txt
files/examples/apple_mobility_daily_2021-04-12.csv
files/examples/alice_in_wonderland.txt
files/examples/continent_tab.tsv
files/examples/first_terms.csv
files/examples/symptoms.xlsx
files/examples/roman.txt
files/examples/fruit.txt
files/examples/shalott_1832.txt
files/examples/year_tab.tsv
files/examples/fars_crash_report.xlsx
files/examples/organdonation.csv
files/examples/continent_tab.csv
files/examples/pride_and_prejudice.txt
files/examples/basics.txt
files/examples/01_mptc_oecd_withcode.pdf
files/examples/continent_sizes.csv
files/examples/country-intermediate.tsv
files/examples/SAS_on_2021-04-13.csv
files/examples/country-working.tsv
files/examples/words.txt
files/examples/mortality.txt
files/examples/sentences.txt
files/examples/ascii_table.xlsx
files/examples/gss_panel_long.dta
files/examples/congress
files/examples/congress/23_101_congress.csv
files/examples/congress/28_106_congress.csv
files/examples/congress/08_86_congress.csv
files/examples/congress/05_83_congress.csv
files/examples/congress/31_109_congress.csv
files/examples/congress/24_102_congress.csv
files/examples/congress/16_94_congress.csv
files/examples/congress/37_115_congress.csv
files/examples/congress/13_91_congress.csv
files/examples/congress/25_103_congress.csv
files/examples/congress/30_108_congress.csv
files/examples/congress/01_79_congress.csv
files/examples/congress/09_87_congress.csv
files/examples/congress/36_114_congress.csv
files/examples/congress/17_95_congress.csv
files/examples/congress/22_100_congress.csv
files/examples/congress/04_82_congress.csv
files/examples/congress/29_107_congress.csv
files/examples/congress/12_90_congress.csv
files/examples/congress/15_93_congress.csv
files/examples/congress/11_89_congress.csv
files/examples/congress/35_113_congress.csv
files/examples/congress/06_84_congress.csv
files/examples/congress/26_104_congress.csv
files/examples/congress/03_81_congress.csv
files/examples/congress/32_110_congress.csv
files/examples/congress/18_96_congress.csv
files/examples/congress/21_99_congress.csv
files/examples/congress/07_85_congress.csv
files/examples/congress/10_88_congress.csv
files/examples/congress/33_111_congress.csv
files/examples/congress/14_92_congress.csv
files/examples/congress/02_80_congress.csv
files/examples/congress/38_116_congress.csv
files/examples/congress/34_112_congress.csv
files/examples/congress/20_98_congress.csv
files/examples/congress/27_105_congress.csv
files/examples/congress/19_97_congress.csv
files/examples/fars0-17daily.csv
files/examples/shalott_1842.txt
files/examples/alice_noboiler.txt
files/examples/countries_iso3.csv
files/examples/country_tab.tsv
files/examples/zshrc.txt
files/scripts
files/scripts/hello-world.sh
files/scripts/make-thumbnail.sh
files/bib
files/bib/samplesyllabus.csl
files/bib/american-political-science-association.csl
files/bib/references.bib
files/bib/chicago-fullnote-bibliography-no-bib.csl
files/bib/mptc_references.bib
files/bib/chicago-fullnote-bibliography.csl
files/bib/chicago-syllabus-no-bib.csl
files/bib/apa.csl
files/bib/chicago-author-date.csl
files/bib/.auctex-auto
files/bib/.auctex-auto/references.el
files/bib/chicago-note-bibliography.csl
files/01_1890_hollerith_codes.png
We can use globbing (or wildcards) to narrow our search:
# Everything underneath the `files/` subdirectory
# whose name ends in `.csl`
find files -name "*.csl"
files/bib/samplesyllabus.csl
files/bib/american-political-science-association.csl
files/bib/chicago-fullnote-bibliography-no-bib.csl
files/bib/chicago-fullnote-bibliography.csl
files/bib/chicago-syllabus-no-bib.csl
files/bib/apa.csl
files/bib/chicago-author-date.csl
files/bib/chicago-note-bibliography.csl
Here we use the . to mean “Search in the current folder”
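A minimal sketch of a search that starts at ., run inside a temporary directory so the result is predictable. DEMO_DIR and the file names are made up for the example.

```shell
# Set up a tiny project-like folder to search.
DEMO_DIR=$(mktemp -d)
mkdir -p "$DEMO_DIR/slides"
touch "$DEMO_DIR/index.qmd" "$DEMO_DIR/slides/01-slides.qmd" "$DEMO_DIR/notes.txt"

cd "$DEMO_DIR"
# . is the starting point: the current directory and everything below it.
# Quoting the pattern makes sure find, not the shell, expands the *.
FOUND=$(find . -name "*.qmd" | sort)
echo "$FOUND"

# Tidy up.
cd /
rm -r "$DEMO_DIR"
```

Results that start from . are reported as relative paths beginning with ./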
The -exec option lets us do things with each result.
{} expands to each found file in turn.
Here we use echo to see what the rm (remove) command would do.
A closing ";" or \; is required to end the -exec command.
rm files/01_apple_macintosh.png
rm files/01_bryant_hard_drive.png
rm files/fars_spreadsheet_raw.png
rm files/01_1890_hollerith_codes.png
If we omitted the echo here the found files really would be deleted one at a time.
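The dry-run pattern can be sketched as follows, in a throwaway directory with invented file names (DEMO_DIR, a.png, b.png, keep.txt). Because echo is in front of rm, nothing is removed.

```shell
# Two png files and one file that should never match.
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/a.png" "$DEMO_DIR/b.png" "$DEMO_DIR/keep.txt"

# Dry run: echo prints the rm command for each match instead of running it.
# {} is replaced by each found file; \; marks the end of the -exec command.
DRY_RUN=$(find "$DEMO_DIR" -name "*.png" -exec echo rm {} \; | sort)
echo "$DRY_RUN"

# Check that nothing has been deleted.
REMAINING=$(find "$DEMO_DIR" -name "*.png" | wc -l | tr -d ' ')
echo "png files still present: $REMAINING"

rm -r "$DEMO_DIR"
```

Printing the commands first and reading them over is a cheap safeguard before any batch deletion.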
We can also use xargs to act on search results:
# Everything underneath the `files/` subdirectory
# whose name ends in `.png`
find files -name "*.png"
files/01_apple_macintosh.png
files/01_bryant_hard_drive.png
files/fars_spreadsheet_raw.png
files/01_1890_hollerith_codes.png
Convert all these png files to jpg:
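The conversion command itself is not shown in this text, so the following is only a sketch of the general find-to-xargs pattern, with echo convert standing in for a real image converter. DEMO_DIR, PLANNED, and the file names are invented for the example.

```shell
# Two png files, one with a space in its name.
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/one.png" "$DEMO_DIR/two words.png"

# -print0 and -0 separate file names with a null byte,
# so names containing spaces are passed along intact.
# -n 1 runs the command once per file.
# "echo convert" is a placeholder for whatever tool does the real work.
PLANNED=$(find "$DEMO_DIR" -name "*.png" -print0 | xargs -0 -n 1 echo convert | sort)
echo "$PLANNED"

rm -r "$DEMO_DIR"
```

xargs takes the names arriving on standard input and turns them into arguments for the command that follows it.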
Check:
files/01_apple_macintosh.png
files/01_bryant_hard_drive.png
files/fars_spreadsheet_raw.png
files/01_1890_hollerith_codes.png
files/01_apple_macintosh.jpg
files/01_bryant_hard_drive.jpg
files/fars_spreadsheet_raw.jpg
files/01_1890_hollerith_codes.jpg
Delete them (with another method of deletion):
Obviously you will not be doing this sort of thing every day of the week. But you may well want to programmatically rename, move, convert, or otherwise manipulate files in batches from time to time. Especially if there are a lot of them, the shell can help you.
? ! , . # $ * <space> and the like) are not a problem.
Find all files in or below the project directory that end in .qmd:
./schedule/index.qmd
./example/04-example.qmd
./example/08-example.qmd
./example/01-example.qmd
./example/02-example.qmd
./example/07-example.qmd
./example/index.qmd
./example/09-example.qmd
./example/05-example.qmd
./example/06-example.qmd
./example/03-example.qmd
./content/09-content.qmd
./content/05-content.qmd
./content/10-content.qmd
./content/06-content.qmd
./content/03-content.qmd
./content/index.qmd
./content/04-content.qmd
./content/01-content.qmd
./content/08-content.qmd
./content/02-content.qmd
./content/07-content.qmd
./assignment/04-assignment.qmd
./assignment/03-assignment.qmd
./assignment/02-assignment.qmd
./assignment/05-assignment.qmd
./assignment/07-assignment.qmd
./assignment/08-assignment.qmd
./assignment/01-assignment.qmd
./assignment/09-assignment.qmd
./assignment/06-assignment.qmd
./assignment/index.qmd
./about/index.qmd
./index.qmd
./slides/08-slides.qmd
./slides/05b-slides.qmd
./slides/05-slides.qmd
./slides/00-slides.qmd
./slides/07-slides.qmd
./slides/02-slides.qmd
./slides/01a-slides.qmd
./slides/01b-slides.qmd
./slides/10-slides.qmd
./slides/09-slides.qmd
./slides/04-slides.qmd
./slides/03-slides.qmd
./slides/06-slides.qmd
./syllabus/index.qmd
./README.qmd
Find all files in or below the current directory that start with two characters followed by -example and end with any other number of characters:
./example/08-example.html
./example/04-example.qmd
./example/08-example.qmd
./example/09-example_files
./example/09-example.html
./example/01-example.qmd
./example/02-example.qmd
./example/04-example.html
./example/03-example.html
./example/07-example_files
./example/07-example.qmd
./example/02-example.html
./example/05-example.html
./example/09-example.qmd
./example/07-example.html
./example/01-example.html
./example/06-example.html
./example/05-example.qmd
./example/06-example.qmd
./example/03-example.qmd
./_freeze/example/03-example
./_freeze/example/02-example
./_freeze/example/09-example
./_freeze/example/08-example
./_freeze/example/01-example
./_freeze/example/04-example
./_freeze/example/05-example
./_freeze/example/07-example
./_freeze/example/06-example
./.quarto/idx/example/08-example.qmd.json
./.quarto/idx/example/03-example.qmd.json
./.quarto/idx/example/07-example.qmd.json
./.quarto/idx/example/09-example.qmd.json
./.quarto/idx/example/02-example.qmd.json
./.quarto/idx/example/06-example.qmd.json
./.quarto/idx/example/01-example.qmd.json
./.quarto/idx/example/05-example.qmd.json
./.quarto/idx/example/04-example.qmd.json
./.quarto/_freeze/example/03-example
./.quarto/_freeze/example/02-example
./.quarto/_freeze/example/09-example
./.quarto/_freeze/example/08-example
./.quarto/_freeze/example/01-example
./.quarto/_freeze/example/04-example
./.quarto/_freeze/example/05-example
./.quarto/_freeze/example/07-example
./.quarto/_freeze/example/06-example
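The pattern described above can be written with the two basic wildcards: `?` stands for exactly one character and `*` for any number of characters, including none. A minimal sketch on made-up files:

```shell
# A small stand-in tree; all names are hypothetical.
project=$(mktemp -d)
mkdir -p "$project/example/07-example_files"
touch "$project/example/07-example.qmd" "$project/example/07-example.html" \
      "$project/example/index.qmd" "$project/example/100-example.qmd"

# ?? = exactly two characters, then the literal text -example,
# then * = anything at all (or nothing).
matches=$(cd "$project" && find . -name '??-example*' | sort)
printf '%s\n' "$matches"
```

Without `-type f`, directories such as `07-example_files` match too, which is why the listing above mixes files and folders. `100-example.qmd` is not matched, because three characters come before `-example`.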
See how these sort:
1.txt
10.txt
11.txt
12.txt
13.txt
14.txt
15.txt
2.txt
3.txt
4.txt
5.txt
6.txt
7.txt
8.txt
9.txt
Not what we want.
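The usual fix is to pad the numbers with leading zeros, so that sorting the names as text agrees with sorting them as numbers. A minimal sketch in a temporary directory, with `LC_ALL=C` set so that the sort order does not depend on the machine's locale:

```shell
workdir=$(mktemp -d)
mkdir "$workdir/plain" "$workdir/padded"

# Create 1.txt ... 15.txt, and zero-padded 01.txt ... 15.txt alongside.
i=1
while [ "$i" -le 15 ]; do
  touch "$workdir/plain/$i.txt"
  touch "$workdir/padded/$(printf '%02d' "$i").txt"
  i=$((i + 1))
done

# ls sorts names as text, character by character.
plain_second=$(LC_ALL=C ls "$workdir/plain" | sed -n '2p')
padded_second=$(LC_ALL=C ls "$workdir/padded" | sed -n '2p')
echo "plain:  $plain_second"    # 10.txt
echo "padded: $padded_second"   # 02.txt
```

Two digits are enough for up to 99 files. If there could be more, pad to three or four from the start.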
total 0
-rw-r--r--@ 1 kjhealy staff 0 Nov 5 10:57 a01.txt
-rw-r--r--@ 1 kjhealy staff 0 Nov 5 10:57 a02.txt
-rw-r--r--@ 1 kjhealy staff 0 Nov 5 10:57 a03.txt
-rw-r--r--@ 1 kjhealy staff 0 Nov 5 10:57 b01.txt
-rw-r--r--@ 1 kjhealy staff 0 Nov 5 10:57 b02.txt
-rw-r--r--@ 1 kjhealy staff 0 Nov 5 10:57 b03.txt
-rw-r--r--@ 1 kjhealy staff 0 Nov 5 10:57 c01.txt
-rw-r--r--@ 1 kjhealy staff 0 Nov 5 10:57 c02.txt
-rw-r--r--@ 1 kjhealy staff 0 Nov 5 10:57 c03.txt
-rw-r--r--@ 1 kjhealy staff 0 Nov 5 10:57 d01.txt
-rw-r--r--@ 1 kjhealy staff 0 Nov 5 10:57 d02.txt
-rw-r--r--@ 1 kjhealy staff 0 Nov 5 10:57 d03.txt
In general keep your names lower-case.
total 408
drwxr-xr-x@ 8 kjhealy staff 256 Aug 15 2023 R
-rw-r--r--@ 1 kjhealy staff 1967 Sep 18 2023 README.md
-rw-r--r-- 1 kjhealy staff 1764 Jan 23 2024 README.qmd
drwxr-xr-x@ 3 kjhealy staff 96 Sep 18 2023 README_files
drwxr-xr-x 3 kjhealy staff 96 Jan 9 2024 _extensions
drwxr-xr-x@ 10 kjhealy staff 320 Nov 5 10:57 _freeze
-rw-r--r--@ 1 kjhealy staff 4750 Nov 5 10:34 _quarto.yml
drwxr-xr-x@ 2 kjhealy staff 64 Nov 5 10:56 _site
drwxr-xr-x@ 7 kjhealy staff 224 Nov 5 10:54 _targets
-rw-r--r--@ 1 kjhealy staff 2737 Sep 27 17:40 _targets.R
-rw-r--r--@ 1 kjhealy staff 997 Aug 26 14:40 _variables.yml
drwxr-xr-x@ 4 kjhealy staff 128 Nov 5 10:57 about
drwxr-xr-x@ 16 kjhealy staff 512 Nov 5 10:12 assets
drwxr-xr-x@ 22 kjhealy staff 704 Nov 5 10:57 assignment
lrwxr-xr-x 1 kjhealy staff 135 Nov 5 10:23 avhrr -> /Users/kjhealy/Documents/data/misc/noaa_ncei/raw/www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr
drwxr-xr-x@ 24 kjhealy staff 768 Nov 5 10:57 content
drwxr-xr-x@ 5 kjhealy staff 160 Nov 5 10:14 data
-rwxr-xr-x@ 1 kjhealy staff 437 Oct 23 08:19 deploy.sh
drwxr-xr-x@ 24 kjhealy staff 768 Nov 5 10:57 example
drwxr-xr-x@ 12 kjhealy staff 384 Nov 5 10:57 files
drwxr-xr-x 14 kjhealy staff 448 Jan 8 2024 html
-rw-r--r--@ 1 kjhealy staff 50673 Nov 5 10:57 index.html
-rw-r--r--@ 1 kjhealy staff 6937 Oct 23 10:59 index.qmd
-rw-r--r--@ 1 kjhealy staff 258 Oct 29 17:52 mptc.Rproj
drwxr-xr-x@ 7 kjhealy staff 224 Aug 15 2023 renv
-rw-r--r--@ 1 kjhealy staff 63998 Nov 4 08:04 renv.lock
-rw-r--r-- 1 kjhealy staff 46717 Dec 11 2023 renv.lock.orig
drwxr-xr-x@ 4 kjhealy staff 128 Nov 5 10:57 schedule
lrwxr-xr-x 1 kjhealy staff 66 Nov 5 10:23 seas -> /Users/kjhealy/Documents/data/misc/noaa_ncei/raw/World_Seas_IHO_v3
drwxr-xr-x@ 11 kjhealy staff 352 Nov 5 10:57 site_libs
drwxr-xr-x@ 21 kjhealy staff 672 Nov 5 10:57 slides
drwxr-xr-x 6 kjhealy staff 192 Aug 28 09:24 staging
drwxr-xr-x 3 kjhealy staff 96 Nov 5 10:54 syllabus
total 504
drwxr-xr-x@ 44 kjhealy staff 1408 Nov 5 10:57 .
drwxr-xr-x@ 31 kjhealy staff 992 Oct 6 09:28 ..
-rw-r--r--@ 1 kjhealy staff 10244 Sep 3 13:23 .DS_Store
-rw-r--r--@ 1 kjhealy staff 17417 Oct 31 15:07 .Rhistory
-rw-r--r--@ 1 kjhealy staff 26 Aug 15 2023 .Rprofile
drwxr-xr-x@ 4 kjhealy staff 128 Aug 10 2023 .Rproj.user
drwxr-xr-x@ 16 kjhealy staff 512 Nov 5 10:42 .git
-rw-r--r--@ 1 kjhealy staff 346 Oct 8 11:21 .gitignore
-rw-r--r-- 1 kjhealy staff 71 Jan 9 2024 .gitmodules
-rw-r--r--@ 1 kjhealy staff 821 Aug 16 2023 .luarc.json
drwxr-xr-x@ 6 kjhealy staff 192 Nov 5 10:56 .quarto
drwxr-xr-x@ 8 kjhealy staff 256 Aug 15 2023 R
-rw-r--r--@ 1 kjhealy staff 1967 Sep 18 2023 README.md
-rw-r--r-- 1 kjhealy staff 1764 Jan 23 2024 README.qmd
drwxr-xr-x@ 3 kjhealy staff 96 Sep 18 2023 README_files
drwxr-xr-x 3 kjhealy staff 96 Jan 9 2024 _extensions
drwxr-xr-x@ 10 kjhealy staff 320 Nov 5 10:57 _freeze
-rw-r--r--@ 1 kjhealy staff 4750 Nov 5 10:34 _quarto.yml
drwxr-xr-x@ 2 kjhealy staff 64 Nov 5 10:56 _site
drwxr-xr-x@ 7 kjhealy staff 224 Nov 5 10:54 _targets
-rw-r--r--@ 1 kjhealy staff 2737 Sep 27 17:40 _targets.R
-rw-r--r--@ 1 kjhealy staff 997 Aug 26 14:40 _variables.yml
drwxr-xr-x@ 4 kjhealy staff 128 Nov 5 10:57 about
drwxr-xr-x@ 16 kjhealy staff 512 Nov 5 10:12 assets
drwxr-xr-x@ 22 kjhealy staff 704 Nov 5 10:57 assignment
lrwxr-xr-x 1 kjhealy staff 135 Nov 5 10:23 avhrr -> /Users/kjhealy/Documents/data/misc/noaa_ncei/raw/www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr
drwxr-xr-x@ 24 kjhealy staff 768 Nov 5 10:57 content
drwxr-xr-x@ 5 kjhealy staff 160 Nov 5 10:14 data
-rwxr-xr-x@ 1 kjhealy staff 437 Oct 23 08:19 deploy.sh
drwxr-xr-x@ 24 kjhealy staff 768 Nov 5 10:57 example
drwxr-xr-x@ 12 kjhealy staff 384 Nov 5 10:57 files
drwxr-xr-x 14 kjhealy staff 448 Jan 8 2024 html
-rw-r--r--@ 1 kjhealy staff 50673 Nov 5 10:57 index.html
-rw-r--r--@ 1 kjhealy staff 6937 Oct 23 10:59 index.qmd
-rw-r--r--@ 1 kjhealy staff 258 Oct 29 17:52 mptc.Rproj
drwxr-xr-x@ 7 kjhealy staff 224 Aug 15 2023 renv
-rw-r--r--@ 1 kjhealy staff 63998 Nov 4 08:04 renv.lock
-rw-r--r-- 1 kjhealy staff 46717 Dec 11 2023 renv.lock.orig
drwxr-xr-x@ 4 kjhealy staff 128 Nov 5 10:57 schedule
lrwxr-xr-x 1 kjhealy staff 66 Nov 5 10:23 seas -> /Users/kjhealy/Documents/data/misc/noaa_ncei/raw/World_Seas_IHO_v3
drwxr-xr-x@ 11 kjhealy staff 352 Nov 5 10:57 site_libs
drwxr-xr-x@ 21 kjhealy staff 672 Nov 5 10:57 slides
drwxr-xr-x 6 kjhealy staff 192 Aug 28 09:24 staging
drwxr-xr-x 3 kjhealy staff 96 Nov 5 10:54 syllabus
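The second listing is longer than the first because it also includes entries whose names begin with a dot. A minimal sketch of the difference, on three made-up files:

```shell
workdir=$(mktemp -d)
touch "$workdir/index.qmd" "$workdir/.gitignore" "$workdir/_quarto.yml"

visible=$(ls "$workdir")         # names starting with . are left out
everything=$(ls -A "$workdir")   # -A adds them, but skips . and ..

echo "ls:    $(printf '%s\n' "$visible" | wc -l | tr -d ' ') entries"
echo "ls -A: $(printf '%s\n' "$everything" | wc -l | tr -d ' ') entries"
```

`ls -a` behaves like `-A` but also lists `.` (this directory) and `..` (its parent), as in the listing above. Some systems show dotfiles by default when you are logged in as root.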
Names beginning with a period, ., are “hidden”
ls
Names beginning with an underscore, _, are often “generated” (though this is a weak convention)
Here’s the .gitignore file for this project:
.Rproj.user
.Rhistory
.RData
.Ruserdata
/.quarto/
/_site/
/renv/
/_freeze/
/_targets/
about/*.pdf
about/*.html
assignment/*.html
example/*.html
schedule/*.html
syllabus/*.html
data/dfstrat.csv
slides/*.pdf
slides/*.html
slides/**/*_cache/*
slides/libs/*
projects/*.zip
seas
avhrr
# knitr and caching
**/*_files/*
**/*_cache/*
README.html
/.luarc.json
A .bashrc file to configure non-login shells for Bash:
# Put the contents of this file in your ~/.bashrc file
# ~/.bashrc: executed by bash(1) for non-login shells.
# see /usr/share/doc/bash/examples/startup-files (in the package bash-doc)
# for examples
# If not running interactively, don't do anything
case $- in
    *i*) ;;
      *) return;;
esac
# don't put duplicate lines or lines starting with space in the history.
# See bash(1) for more options
HISTCONTROL=ignoreboth
# append to the history file, don't overwrite it
shopt -s histappend
# for setting history length see HISTSIZE and HISTFILESIZE in bash(1)
HISTSIZE=1000
HISTFILESIZE=2000
# check the window size after each command and, if necessary,
# update the values of LINES and COLUMNS.
shopt -s checkwinsize
# If set, the pattern "**" used in a pathname expansion context will
# match all files and zero or more directories and subdirectories.
#shopt -s globstar
# make less more friendly for non-text input files, see lesspipe(1)
#[ -x /usr/bin/lesspipe ] && eval "$(SHELL=/bin/sh lesspipe)"
# set variable identifying the chroot you work in (used in the prompt below)
if [ -z "${debian_chroot:-}" ] && [ -r /etc/debian_chroot ]; then
    debian_chroot=$(cat /etc/debian_chroot)
fi
# set a fancy prompt (non-color, unless we know we "want" color)
case "$TERM" in
    xterm-color) color_prompt=yes;;
esac
# uncomment for a colored prompt, if the terminal has the capability; turned
# off by default to not distract the user: the focus in a terminal window
# should be on the output of commands, not on the prompt
force_color_prompt=yes
if [ -n "$force_color_prompt" ]; then
    if [ -x /usr/bin/tput ] && tput setaf 1 >&/dev/null; then
        # We have color support; assume it's compliant with Ecma-48
        # (ISO/IEC-6429). (Lack of such support is extremely rare, and such
        # a case would tend to support setf rather than setaf.)
        color_prompt=yes
    else
        color_prompt=
    fi
fi
if [ "$color_prompt" = yes ]; then
    # PS1='${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$ '
    PS1='${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\H\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\] \$ '
else
    PS1='${debian_chroot:+($debian_chroot)}\u@\h:\w\$ '
fi
unset color_prompt force_color_prompt
# If this is an xterm set the title to user@host:dir
case "$TERM" in
xterm*|rxvt*)
    PS1="\[\e]0;${debian_chroot:+($debian_chroot)}\u@\h: \w\a\]$PS1"
    ;;
*)
    ;;
esac
# enable color support of ls and also add handy aliases
if [ -x /usr/bin/dircolors ]; then
    test -r ~/.dircolors && eval "$(dircolors -b ~/.dircolors)" || eval "$(dircolors -b)"
    alias ls='ls --color=auto'
    #alias dir='dir --color=auto'
    #alias vdir='vdir --color=auto'
    alias grep='grep --color=auto'
    alias fgrep='fgrep --color=auto'
    alias egrep='egrep --color=auto'
fi
# some more ls aliases
#alias ll='ls -l'
#alias la='ls -A'
#alias l='ls -CF'
# Alias definitions.
# You may want to put all your additions into a separate file like
# ~/.bash_aliases, instead of adding them here directly.
# See /usr/share/doc/bash-doc/examples in the bash-doc package.
if [ -f ~/.bash_aliases ]; then
    . ~/.bash_aliases
fi
# enable programmable completion features (you don't need to enable
# this, if it's already enabled in /etc/bash.bashrc and /etc/profile
# sources /etc/bash.bashrc).
if ! shopt -oq posix; then
    if [ -f /usr/share/bash-completion/bash_completion ]; then
        . /usr/share/bash-completion/bash_completion
    elif [ -f /etc/bash_completion ]; then
        . /etc/bash_completion
    fi
fi
# Put the contents of this file in your ~/.zshrc file.
# Source: https://github.com/belak/zsh-utils?tab=readme-ov-file
[[ ! -d "$HOME/.antigen" ]] && git clone https://github.com/zsh-users/antigen.git "$HOME/.antigen"
source "$HOME/.antigen/antigen.zsh"
# Set the default plugin repo to be zsh-utils
antigen use belak/zsh-utils --branch=main
# Specify completions we want before the completion module
antigen bundle zsh-users/zsh-completions
# Specify plugins we want
antigen bundle editor@main
antigen bundle history@main
antigen bundle prompt@main
antigen bundle utility@main
antigen bundle completion@main
# Specify additional external plugins we want
antigen bundle zsh-users/zsh-syntax-highlighting
# Load everything
antigen apply
# Set any settings or overrides here
prompt belak
bindkey -e
Don’t blindly install things
Installing things via shell scripts should only be done from trusted sources!
ls command again:
We can send, or pipe, this output to another command, instead of to the terminal:
The wc command counts the number of words in a file, or in whatever is sent to it via STDIN.
The -l switch to wc means ‘just count lines instead of words’.
Like with pipelines in R, we can compose sequences of actions at the prompt:
❯ head access.log
192.195.49.31 - - [27/Aug/2023:00:01:11 +0000] "GET / HTTP/1.1" 200 19219 "https://www.google.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36 Edg/116.0.1938.54"
192.195.49.31 - - [27/Aug/2023:00:01:12 +0000] "GET /libs/tufte-css-2015.12.29/tufte.css HTTP/1.1" 200 2025 "https://socviz.co/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36 Edg/116.0.1938.54"
192.195.49.31 - - [27/Aug/2023:00:01:12 +0000] "GET /libs/tufte-css-2015.12.29/envisioned.css HTTP/1.1" 200 888 "https://socviz.co/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36 Edg/116.0.1938.54"
192.195.49.31 - - [27/Aug/2023:00:01:12 +0000] "GET /css/tablesaw-stackonly.css HTTP/1.1" 200 1640 "https://socviz.co/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36 Edg/116.0.1938.54"
192.195.49.31 - - [27/Aug/2023:00:01:12 +0000] "GET /css/nudge.css HTTP/1.1" 200 1675 "https://socviz.co/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36 Edg/116.0.1938.54"
192.195.49.31 - - [27/Aug/2023:00:01:12 +0000] "GET /css/sourcesans.css HTTP/1.1" 200 1492 "https://socviz.co/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36 Edg/116.0.1938.54"
192.195.49.31 - - [27/Aug/2023:00:01:13 +0000] "GET /js/jquery.js HTTP/1.1" 200 30464 "https://socviz.co/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36 Edg/116.0.1938.54"
192.195.49.31 - - [27/Aug/2023:00:01:13 +0000] "GET /js/tablesaw-stackonly.js HTTP/1.1" 200 2996 "https://socviz.co/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36 Edg/116.0.1938.54"
192.195.49.31 - - [27/Aug/2023:00:01:13 +0000] "GET /js/nudge.min.js HTTP/1.1" 200 937 "https://socviz.co/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36 Edg/116.0.1938.54"
52.13.187.67 - - [27/Aug/2023:00:01:13 +0000] "GET /dataviz-pdfl_files/figure-html4/ch-03-fig-lexp-gdp-10-1.png HTTP/1.1" 200 308830 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0"
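`head` prints the first ten lines of a file unless told otherwise, which is why ten requests appear above. As a minimal sketch of how it combines with `wc -l` in a pipe, here is the same idea on a made-up five-line file rather than the real `access.log`:

```shell
# A made-up five-line file standing in for a real log.
logfile=$(mktemp)
printf 'line %s\n' 1 2 3 4 5 > "$logfile"

first_two=$(head -n 2 "$logfile")                        # just the first two lines
line_count=$(wc -l < "$logfile" | tr -d ' ')             # wc reads the file via STDIN
piped_count=$(head -n 2 "$logfile" | wc -l | tr -d ' ')  # wc counts what head sends it

echo "whole file: $line_count lines; after head -n 2: $piped_count lines"
```

The `tr -d ' '` step only strips the padding that some versions of `wc` put in front of the number.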
Like with pipelines in R, we can compose sequences of actions at the prompt:
❯ awk '// {print $11}' access.log | sort | uniq -c | sort -nr | head -n 15
9729 "https://socviz.co/lookatdata.html"
4851 "-"
4212 "https://socviz.co/"
1719 "https://socviz.co/makeplot.html"
1477 "https://bookdown.org/"
1466 "https://socviz.co/gettingstarted.html"
1373 "https://socviz.co/groupfacettx.html"
864 "https://socviz.co/workgeoms.html"
794 "https://socviz.co/maps.html"
733 "https://socviz.co/refineplots.html"
671 "https://socviz.co/index.html"
349 "https://socviz.co/appendix.html"
228 "https://socviz.co/modeling.html"
153 "https://www.google.com/"
50 "http://vissoc.co/"
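To see what each stage of that pipeline contributes, here is a minimal sketch of the same pattern on six made-up lines. The second field stands in for field 11 of the real log.

```shell
counts=$(printf 'GET %s\n' a.html b.html a.html c.html a.html b.html |
  awk '{print $2}' |  # keep only the second whitespace-separated field
  sort |              # put identical lines next to each other
  uniq -c |           # collapse each run of duplicates, prefixing its count
  sort -nr |          # order by that count, largest first
  head -n 2)          # keep the top two

printf '%s\n' "$counts"

# The first row, with uniq's leading padding removed.
top=$(printf '%s\n' "$counts" | awk 'NR == 1 {print $1, $2}')
```

The first `sort` matters: `uniq` only merges duplicate lines that are adjacent, so unsorted input would be undercounted.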
We can do a lot with a pipeline:
curl -s 'http://api.citybik.es/v2/networks/citi-bike-nyc' |
jq '.network.stations[].free_bikes' |
gpaste -sd+ |
bc
32517
This is the number of Citi Bikes available in New York City at the time these slides were made.
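The last two steps of that pipeline are a small trick for adding up a column of numbers: `paste -sd+` joins the lines into a single expression like `3+5+7`, and `bc` evaluates it. (`gpaste` is GNU `paste` as it is usually named when installed on a Mac.) A minimal sketch with made-up numbers, using the shell's own arithmetic in place of `bc` so that it runs without anything extra installed:

```shell
# Made-up counts standing in for the per-station numbers that jq emits.
# -s joins all lines into one; -d+ puts a + between them;
# the trailing - means "read from STDIN".
expression=$(printf '%s\n' 3 5 7 | paste -sd+ -)

# Evaluate the expression. In the original pipeline bc does this step.
total=$(( $expression ))

echo "$expression = $total"
```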
We usually won’t use the Unix command line or shell to do things like this. We’ll do it in R. You could also do it in other languages. But basic shell competence remains extremely handy for many more common tasks.
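One such task is saving a sequence of commands in a file so it can be rerun. A minimal sketch, assuming a standard `sh` at `/bin/sh` and using a hypothetical script name, `hello.sh`:

```shell
# Write a tiny script into a temporary directory.
scriptdir=$(mktemp -d)
cat > "$scriptdir/hello.sh" <<'EOF'
#!/bin/sh
# The first line names the interpreter that should run this file.
echo "Hello from a script"
EOF

chmod +x "$scriptdir/hello.sh"     # mark the file as executable
output=$("$scriptdir/hello.sh")    # run it by giving its path
echo "$output"
```

Without the `chmod` step the file can still be run as `sh hello.sh`, but not directly by its own name.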
#! or “shebang” line saying where the interpreter is
chmod 755 script.sh or chmod +x script.sh to make executable
In an era of Generative AI and LLMs, why are we covering this stuff?
Because Unix is still everywhere
As soon as you try to do anything of any sort of technical complexity, or just simple reproducibility, with your computer—even using the newest and coolest tools—I promise you’ll eventually find yourself in a world governed by the metaphors and methods Unix originated, and, very likely, in a literal Unix-derived environment.
That is, you will be in some sort of folder-based hierarchy; you will edit plain-text files in order to configure, launch, generate, or capture the output of applications; and you will do this by way of instructions written down as a series of commands that follow some sort of regular syntax. The details of those instructions (and the particular conventions they use) will vary depending on the task at hand. But in essence you will always be doing the same thing.