bash parse apache log file
To parse an Apache access log in the following format
<pre>
65.160.238.180 - - [04/Nov/2006:05:24:24 -0800] "GET /code/pcpaper.html HTTP/1.1" 304 - "-" "Mozilla/4.0 (compatible;)"
</pre>
use this bash script. It strips the square brackets around the timestamp, replaces literal commas with {comma} so they cannot be mistaken for field separators, and then splits each record on double quotes:
<pre>
cat access_061105.log | \
sed 's/\[//' | sed 's/\]//' | \
sed 's/,/\{comma\}/g' | \
awk -F '\"' '{ ca=split($1, A, " "); \
printf "%d,%s,%s,%s,%s,%s,%d", NF,A[1],A[2],A[3],A[4],A[5],ca; \
gsub(" ", ",", $2); gsub(" ", ",", $3); \
printf(",%s%s%s,%s,%s",$2,$3,$4,$6,$8); printf("\n") }'
</pre>
The output is comma-separated and ready to load into MySQL:
<pre>
fields
1 - should be 7, meaning there were 7 fields separated by " in the original record; changed to 9 after 4/1/2007, when a new field was added at the end
2 - ip
3,4 - - (remote logname and remote user, both empty here)
5 - date and time
6 - timezone offset
7 - should be 5, a verification field
8 - http action
9 - url
10 - http version of client
11 - server http return code
12 - bytes returned
13 - referrer
14 - user agent
new fields after 4/1/2007
15 - host name

7,81.19.66.8,-,-,05/Nov/2006:03:59:21,-0800,5,GET,/robots.txt,HTTP/1.0,404,371,-,StackRambler/2.0 (MSIE incompatible),
9,220.181.19.159,-,-,29/Feb/2008:15:58:54,-0800,5,GET,/webdoc/index.html,HTTP/1.1,200,1502,-,Sogou web spider/3.0(+http://www.sogou.com/docs/help/webmasters.htm#07),www.bayimage.com
</pre>
2008-03-28 16:50:32 GMT
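The pipeline above can be tried on a single record before pointing it at a whole log file. The following is only a sketch: it wraps the same sed and awk steps in a function and adds a filter based on the two verification fields (field 1 should be 7 or 9, field 7 should be 5). The function names `parse_apache_log` and `valid_rows` are made up for this example and are not part of the original script.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper around the same sed/awk steps; reads log lines on stdin.
parse_apache_log() {
  sed -e 's/\[//' -e 's/]//' -e 's/,/{comma}/g' |
  awk -F '"' '{
    # $1 is everything before the first quote: ip, -, -, date, timezone
    ca = split($1, A, " ")
    printf "%d,%s,%s,%s,%s,%s,%d", NF, A[1], A[2], A[3], A[4], A[5], ca
    # turn the request ($2) and the status/bytes part ($3) into comma-separated pieces
    gsub(" ", ",", $2); gsub(" ", ",", $3)
    # $4 = referrer, $6 = user agent, $8 = host name (empty for old 7-field records)
    printf ",%s%s%s,%s,%s\n", $2, $3, $4, $6, $8
  }'
}

# Hypothetical sanity filter: keep only rows whose verification fields look right.
valid_rows() {
  awk -F ',' '($1 == 7 || $1 == 9) && $7 == 5'
}

# The sample record from this post.
sample='65.160.238.180 - - [04/Nov/2006:05:24:24 -0800] "GET /code/pcpaper.html HTTP/1.1" 304 - "-" "Mozilla/4.0 (compatible;)"'

out=$(printf '%s\n' "$sample" | parse_apache_log)
checked=$(printf '%s\n' "$out" | valid_rows)
printf '%s\n' "$out"
```

For an old-format record like the sample, the host name field is empty, so the row ends with a trailing comma, just like the first output row shown above. Rows that `valid_rows` drops are worth inspecting by hand, since an unexpected quote count usually means a quote character inside the URL, referrer or user agent.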
