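Hello, this is kriw. Today I would like to introduce shell-gei (シェル芸, roughly "shell tricks").
This is the day 5 article of rogyAdventCalendar. It has nothing to do with traPAdventCalendar.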
What is shell-gei?
Shell-gei is the art of completing an entire task on the command line. The term usually refers to one-liners.
Environment
Mac OS X, zsh.
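Preparing for shell-gei
If you are on Linux, you should be fine as you are.
If you are on OS X, I recommend installing GNU Core Utils. (OS X ships the BSD versions of the commands, which behave differently from the Linux ones.)
If you are on Windows, set up a virtual machine or try your luck with Bash on Ubuntu on Windows.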
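Installing Core Utils (OS X users)
brew install coreutils gnu-sed
After this, the GNU versions are available under the usual command names with a g prefix (e.g. head -> ghead). Note that sed is not part of coreutils: gsed comes from the separate gnu-sed package.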
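If the same one-liner should run on both Linux and OS X, one option is to choose the command name at run time. The following is only a sketch: it assumes the g-prefixed command was installed via Homebrew as described here, and the variable name SED is made up for this example.

```shell
# Prefer GNU sed under its Homebrew name (gsed); otherwise fall back to plain sed.
# SED is a hypothetical variable name chosen for this example.
if command -v gsed >/dev/null 2>&1; then
  SED=gsed
else
  SED=sed
fi

# Use "$SED" wherever the one-liner would otherwise call sed.
printf 'a,b,c\n' | "$SED" 's/,/ /g'
```

The same pattern should work for other commands such as head/ghead or cut/gcut.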
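Let's try it
This time we will process a CSV file.
I used StatCounter's browser market share data (the download command is in level1).
The pasted code is very wide (these are one-liners, after all), so it is hard to read. Sorry about that.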
level1
First, download the data
wget "http://gs.statcounter.com/chart.php?device=Desktop%2C%20Tablet%20%26%20Console&device_hidden=desktop%2Btablet%2Bconsole&multi-device=true&statType_hidden=browser&region_hidden=ww&granularity=monthly&statType=Browser&region=Worldwide&fromInt=201510&toInt=201611&fromMonthYear=2015-10&toMonthYear=2016-11&csv=1" -O browser_data.csv
Check the contents
cat browser_data.csv
It looks like this:
"Date","Chrome","Firefox","IE","Safari","Edge","Opera","Android","Yandex Browser","Coc Coc","UC Browser","Maxthon","360 Safe Browser","Sogou Explorer","Chromium","Sony PS4","QQ Browser","Phantom","Silk","Mozilla","Puffin","Pale Moon","SeaMonkey","Sony PS3","Iceweasel","Amigo","Iron","Vivaldi","Microsoft-WebDAV","NetFront NX","Other"
2015-10,53.62,15.53,15.38,9.13,1.1,1.83,1.34,0.57,0.12,0.13,0.19,0.19,0.13,0.16,0.06,0.1,0.04,0.09,0.03,0.04,0.02,0.03,0.03,0.03,0.02,0.02,0,0,0.01,0.08
2015-11,54.13,14.72,15.46,9.37,1.21,1.81,1.28,0.53,0.13,0.11,0.19,0.18,0.13,0.14,0.06,0.1,0.04,0.09,0.03,0.04,0.02,0.03,0.03,0.03,0.02,0.02,0,0,0.01,0.08
2015-12,53.56,14.3,15.19,9.93,1.47,2.12,1.3,0.5,0.16,0.12,0.2,0.25,0.14,0.13,0.07,0.1,0.04,0.07,0.03,0.03,0.03,0.03,0.03,0.03,0.02,0.04,0.01,0,0.01,0.08
2016-01,54.2,14.61,14.62,9.47,1.69,1.96,1.39,0.49,0.16,0.12,0.19,0.19,0.15,0.13,0.08,0.09,0.04,0.05,0.03,0.03,0.03,0.05,0.03,0.03,0.02,0.04,0.01,0,0.01,0.08
2016-02,55.33,14.67,13.38,9.46,1.84,2,1.39,0.48,0.12,0.13,0.19,0.15,0.14,0.14,0.1,0.09,0.06,0.04,0.04,0.03,0.03,0.03,0.02,0.03,0.02,0.01,0.01,0,0.01,0.07
2016-03,56.4,14.31,12.52,9.47,1.99,1.91,1.39,0.51,0.2,0.15,0.19,0.16,0.15,0.13,0.1,0.09,0.04,0.02,0.03,0.04,0.02,0.03,0.02,0.03,0.02,0,0.01,0,0.01,0.07
2016-04,56.75,14.24,12.14,9.47,2.11,1.87,1.36,0.46,0.19,0.16,0.18,0.22,0.14,0.14,0.1,0.1,0.05,0.02,0.03,0.04,0.02,0.03,0.02,0.03,0.02,0,0.01,0.01,0.01,0.07
2016-05,56.94,14.52,11.38,9.69,2.3,1.83,1.35,0.44,0.19,0.17,0.18,0.14,0.15,0.13,0.13,0.1,0.04,0.01,0.04,0.03,0.02,0.02,0.02,0.03,0.02,0,0.02,0.03,0.01,0.07
2016-06,57.89,14.16,10.71,9.64,2.54,1.72,1.36,0.4,0.21,0.18,0.18,0.13,0.15,0.13,0.13,0.1,0.04,0.01,0.03,0.04,0.03,0.02,0.02,0.02,0.02,0,0.02,0.02,0.01,0.07
2016-07,58.26,13.97,9.77,9.74,2.79,1.77,1.48,0.43,0.3,0.28,0.17,0.13,0.16,0.14,0.11,0.12,0.05,0.02,0.03,0.04,0.03,0.02,0.02,0.02,0.02,0,0.02,0.03,0.01,0.07
2016-08,58.37,13.92,9.8,9.61,2.87,1.78,1.38,0.43,0.32,0.29,0.17,0.15,0.16,0.14,0.13,0.11,0.04,0.01,0.03,0.04,0.03,0.02,0.02,0.01,0.02,0.01,0.02,0.03,0.01,0.07
2016-09,58.75,13.67,9.82,9.63,2.78,1.73,1.31,0.46,0.35,0.28,0.16,0.17,0.15,0.14,0.13,0.11,0.03,0.01,0.04,0.04,0.03,0.02,0.02,0.01,0.02,0.01,0.02,0.02,0.01,0.07
2016-10,59.24,13.29,8.9,10.23,2.83,1.94,1.29,0.45,0.34,0.32,0.17,0.11,0.14,0.15,0.12,0.1,0.03,0.01,0.04,0.04,0.03,0.02,0.02,0.01,0.02,0.02,0.02,0.03,0.01,0.08
2016-11,59.05,13.49,8.8,10.38,2.93,1.84,1.28,0.45,0.34,0.29,0.15,0.11,0.12,0.17,0.13,0.09,0.04,0.01,0.05,0.04,0.03,0.02,0.02,0.01,0.02,0.01,0.02,0.02,0.01,0.07
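Because the rows are this wide, it can help to look at just one corner of the file. Below is a minimal sketch; the function name peek and its two arguments are invented for this example, and the sample rows are the first five columns of the data above.

```shell
# peek ROWS COLS: print the first ROWS lines and first COLS columns of a CSV read from stdin.
# (peek is a hypothetical helper name, not a standard command.)
peek() {
  head -n "$1" | cut -d, -f"1-$2"
}

peek 3 3 <<'EOF'
"Date","Chrome","Firefox","IE","Safari"
2015-10,53.62,15.53,15.38,9.13
2015-11,54.13,14.72,15.46,9.37
2015-12,53.56,14.3,15.19,9.93
EOF
```

With the real file this would be used as peek 3 5 < browser_data.csv.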
level2
Sum the values in row n (= 4) (on OS X, use gsed instead of sed)
cat browser_data.csv | head -n 4 | tail -n 1 | cut -d, -f2- | sed 's/,/\n/g' | awk '{sum+=$1}END{print sum}'
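Output
99.99
This sums the values in row 4 of browser_data.csv (the 2015-12 row, counting the header as row 1). The result is close to 100 because each row holds percentage shares.
Taking the pipeline apart at each pipe:
head -n 4 outputs the first 4 lines of its input.
tail -n 1 outputs the last line of its input, so together the two commands pick out line 4.
**cut -d, -f2-** uses , as the delimiter and outputs the second field onward, which drops the date.
**sed 's/,/\n/g'** replaces every , with a newline. On a Mac it is better to use gsed, because the BSD sed shipped with OS X does not treat \n in the replacement as a newline.
awk reads each line, adds its value to a running sum, and prints the sum at the end.
That is how it works.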
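To try the pipeline on something small enough to check by hand, here is a sketch of the same idea. It swaps sed for tr, so it does not depend on GNU sed. The function name row_sum and the sample numbers are made up for illustration.

```shell
# row_sum N: sum the numeric fields of row N (header = row 1) of a CSV read from stdin.
# (row_sum is a hypothetical helper name.)
row_sum() {
  # head keeps rows 1..N, tail keeps only row N, cut drops the Date column,
  # tr puts each value on its own line, awk adds the values up.
  head -n "$1" | tail -n 1 | cut -d, -f2- | tr ',' '\n' | awk '{sum+=$1} END{print sum}'
}

# Made-up data in the same layout as browser_data.csv.
row_sum 4 <<'EOF'
"Date","A","B","C"
2015-10,50,30,20
2015-11,40,35,25
2015-12,60.5,30.25,9.25
EOF
```

Row 4 of the sample is 60.5 + 30.25 + 9.25, so the result is easy to verify. With the real file the call would be row_sum 4 < browser_data.csv.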
level3
Compute the average of each column
cat browser_data.csv | cut -d, -f2- | awk -F, 'NR>1{for(i=1;i<=NF;i++)hoge[i]+=$i}END{for(i=1;i<=NF;i++)print hoge[i]/(NR-1)}'
Output
56.6071
14.2429
11.9886
9.65929
2.17571
1.86571
1.35071
0.472143
0.223571
0.195
0.179286
0.162857
0.143571
0.140714
0.103571
0.1
0.0414286
0.0328571
0.0342857
0.0371429
0.0264286
0.0264286
0.0228571
0.0228571
0.02
0.0128571
0.0135714
0.0135714
0.01
0.0735714
From top to bottom these are Chrome, Firefox, IE, and so on.
awk supports for loops too, which makes it powerful.
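The per-column accumulation is easier to follow on a tiny input. The sketch below is a variation on the one-liner, not the original: it stores the column count in a variable so the END block does not rely on NF still being set. The names col_avg, total, and cols, and the sample numbers, are invented for this example.

```shell
# col_avg: print the average of every column except the first, for a CSV read from stdin.
# (col_avg, total and cols are hypothetical names.)
col_avg() {
  cut -d, -f2- | awk -F, '
    NR > 1 { for (i = 1; i <= NF; i++) total[i] += $i; cols = NF }  # skip header, add up each column
    END    { for (i = 1; i <= cols; i++) print total[i] / (NR - 1) } # divide by the number of data rows
  '
}

# Made-up data: the averages should be 50, 30 and 20.
col_avg <<'EOF'
"Date","A","B","C"
2015-10,50,30,20
2015-11,40,35,25
2015-12,60,25,15
EOF
```

NR - 1 is the number of data rows because the header line is counted in NR but skipped by the NR > 1 condition.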
level4
Print the browser name together with its average.
cat browser_data.csv | cut -d, -f2- | awk -F, 'NR==1{for(i=1;i<=NF;i++)browser[i]=$i}NR>1{for(i=1;i<=NF;i++)hoge[i]+=$i}END{for(i=1;i<=NF;i++)print browser[i] ": " hoge[i]/(NR-1)}'
Output
"Chrome": 56.6071
"Firefox": 14.2429
"IE": 11.9886
"Safari": 9.65929
"Edge": 2.17571
"Opera": 1.86571
"Android": 1.35071
"Yandex Browser": 0.472143
"Coc Coc": 0.223571
"UC Browser": 0.195
"Maxthon": 0.179286
"360 Safe Browser": 0.162857
"Sogou Explorer": 0.143571
"Chromium": 0.140714
"Sony PS4": 0.103571
"QQ Browser": 0.1
"Phantom": 0.0414286
"Silk": 0.0328571
"Mozilla": 0.0342857
"Puffin": 0.0371429
"Pale Moon": 0.0264286
"SeaMonkey": 0.0264286
"Sony PS3": 0.0228571
"Iceweasel": 0.0228571
"Amigo": 0.02
"Iron": 0.0128571
"Vivaldi": 0.0135714
"Microsoft-WebDAV": 0.0135714
"NetFront NX": 0.01
"Other": 0.0735714
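I also tried out awk's associative arrays.
The output is much easier to read now. I added handling for line 1 to the level3 awk program, so that the final step prints each browser name together with its average.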
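As one possible next step, the quotes around the names could be stripped and the rows sorted by average. This is only a sketch under the assumption that no browser name contains a comma (true for this data). The names browser_avg, name, total, and cols, and the sample numbers, are made up for illustration.

```shell
# browser_avg: print "average name" for each column, largest average first.
# (browser_avg, name, total and cols are hypothetical names.)
browser_avg() {
  cut -d, -f2- | awk -F, '
    NR == 1 {
      # header row: remember each name without its surrounding quotes
      for (i = 1; i <= NF; i++) { gsub(/"/, "", $i); name[i] = $i }
      cols = NF
    }
    NR > 1 { for (i = 1; i <= cols; i++) total[i] += $i }
    END    { for (i = 1; i <= cols; i++) printf "%.2f %s\n", total[i] / (NR - 1), name[i] }
  ' | sort -rn   # numeric sort on the leading average, descending
}

# Made-up data, deliberately not in descending order.
browser_avg <<'EOF'
"Date","IE","Chrome","Firefox"
2015-10,20,50,30
2015-11,10,60,20
EOF
```

Putting the average first keeps the sort step simple, since sort -rn compares the leading number on each line.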
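In closing
This article covered text processing, but you can also do image processing and even edit Excel sheets this way.
Shell-gei lets you get tedious tasks done easily, so I recommend it.
By the way
How to turn your command-line prompt into ? (OS X only)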
bash
PS1="\w \! ?"
zsh
autoload -Uz colors && colors  # defines the $fg and $bg arrays used below
PROMPT="%{$fg[yellow]$bg[black]%}%~ ?"