November 15, 2020

Correct English

§ quotes     # english

Even if you do learn to speak correct English, whom are you going to speak it to?

Clarence Darrow

October 31, 2020

Producing histograms in terminal

§ tech     # cli histograms awk gnuplot

Quite often I need to quickly visualize data in a terminal as a histogram or a chart of some sort. Here are the three methods I use most often.

Gnuplot

The first thing that comes to mind when it comes to charts is gnuplot – a versatile tool to produce all kinds of charts and graphs. A cool feature is its support for dumb terminals, so you can easily produce charts like this:

                                       ping google.com
     55 +---------------------------------------------------------------------------+
        |       +        +       +        +       +        +       +        +       |
        |                                                                           |
     50 |-+                                                                 *     +-|
        |                                                                   *       |
        |                                         *                        * *      |
        |                                        * *                       * *      |
     45 |-+                                      *  *                     *   *   +-|
        |                                       *   *                     *   *     |
        |                                       *    *                   *     *    |
     40 |-+                                    *      *                  *     *  +-|
        |                                      *       *                *       *   |
 ms     |                                     *         *               *       *   |
     35 |-+                                   *          *             *        * +-|
        |                                    *           *             *         *  |
        |                                    *            *           *          *  |
     30 |-+            *****                *              ***        *           *-|
        |          ****     **              *                 *      *            * |
        |**********           **           *                   **    *             *|
        |                       *          *                     ** *              *|
     25 |-+                      **********                         *             +-|
        |                                                          *                |
        |       +        +       +        +       +        +       +        +       |
     20 +---------------------------------------------------------------------------+
        0       1        2       3        4       5        6       7        8       9
                                            count

I’ve generated this graph with the following command:

$ ping -c 10 -i 0.2 google.com | awk '/time=/{ print $(NF-1) }' | cut -d= -f2 | \
  gnuplot -e \
  "set terminal dumb size 90, 30; set autoscale; set title 'ping google.com';
   set ylabel 'ms'; set xlabel 'count'; plot '-' with lines notitle"

A slightly more useful example:

$ sar | awk '/^(09|10)/{print substr($1,1,5), $4}' | \
  gnuplot -e \
  "set terminal dumb size 90, 30; set title '% CPU User'; set autoscale; 
   plot '-' using 2:xtic(1) with lines notitle";

                                       % CPU User
  18 +------------------------------------------------------------------------------+
     |      +      +       +      +      +      +      *******+       +      +      |
     |                                                *       ***                   |
  16 |-+                                              *          *                +-|
     |                                               *            **                |
  14 |-+                                            *               **            +-|
     |                                              *                               |
     |                                             *                  ***           |
  12 |-+                                          *                      *        +-|
     |                                           *                        *         |
     |                                           *                         **       |
  10 |-+                                       **                                 +-|
     |                                       **                              *******|
   8 |-+                                   **                                     +-|
     |                                  ***                                         |
     |                              ****                                            |
   6 |-+                          **                                              +-|
     |                           *                                                  |
     |                          *                                                   |
   4 |-+                       *                                                  +-|
     |                         *                                                    |
   2 |-+                      *                                                   +-|
     |                       *                                                      |
     |*********************+*     +      +      +      +      +       +      +      |
   0 +------------------------------------------------------------------------------+
   09:00  09:10  09:20   09:30  09:40  09:50  10:00  10:10  10:20   10:30  10:40  10:50

The important difference here is that we’re using xtic for the x tic labels. Let’s look at a sample of the input data:

$ sar | awk '/^(09|10)/{print substr($1,1,5), $4}'
09:00 0.44
09:10 0.44
09:20 0.41
09:30 0.37
09:40 6.37
09:50 7.35
10:00 9.58
10:10 17.54
10:20 16.42
10:30 12.82
10:40 9.33
10:50 8.85

Essentially, we have the x tic labels in column one and the actual data in column two. We tell gnuplot about this with the using 2:xtic(1) instruction.


The key is to remember (or write down) this skeleton command:

gnuplot -e "set terminal dumb size 120, 30; set autoscale; plot '-' with lines notitle"

It’s enough to draw a simple chart with a single data series. To account for xtic labels, you just add using 2:xtic(1) to the plot instruction.

Finally, for histogram data use set boxwidth 0.2; plot ... with boxes.
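A minimal sketch of how the histogram variant might be wired together, assuming the raw samples are non-negative numbers, one per line. The samples are first grouped into fixed-width buckets, then the resulting "bucket count" pairs are drawn with boxes. The names bucket_counts and plot_boxes, the bucket width argument and the yrange setting are assumptions made for this illustration, not part of the skeleton command above.

```shell
# bucket_counts WIDTH: read one number per line and print "bucket_start count"
# pairs, sorted numerically, for buckets WIDTH units wide (default 1).
bucket_counts() {
  awk -v w="${1:-1}" '
    { b = int($1 / w) * w; count[b]++ }   # truncate each sample down to its bucket start
    END { for (b in count) print b, count[b] }
  ' | sort -n
}

# plot_boxes: draw "label value" pairs from stdin as boxes on a dumb terminal.
# Requires gnuplot; "set yrange [0:*]" is an addition that anchors the boxes at zero.
plot_boxes() {
  gnuplot -e "set terminal dumb size 90, 30; set boxwidth 0.2; set yrange [0:*]; plot '-' using 2:xtic(1) with boxes notitle"
}

# Usage sketch: distribution of ping times in 5 ms buckets
#   ping -c 50 google.com | awk '/time=/{ print $(NF-1) }' | cut -d= -f2 | \
#     bucket_counts 5 | plot_boxes
```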

perl one-liner

gnuplot is an incredibly powerful tool, but often I find myself looking for a quick and dirty histogram-like representation of data. I don’t want to go about fetching data from a remote system to process it with gnuplot on my laptop, nor am I willing to install gnuplot on the remote system. In such situations I resort to a very old perl-based approach, thanks to the fact that perl is installed almost everywhere.

Let’s return to the above example with CPU usage information from the sar utility. Taking the same input, we can do this:

$ sar | awk '/^(09|10)/{print substr($1,1,5), $4}' | \
  perl -lane 'print $F[0], "\t", "=" x $F[1]'
09:00
09:10
09:20
09:30
09:40	======
09:50	=======
10:00	=========
10:10	=================
10:20	================
10:30	============
10:40	=========
10:50	========

It’s amazing how simple this command is for the results it produces! When the raw numbers are too big for the terminal width, it’s easy to add scaling with ($F[1]/<scale_factor>), for example:

perl -lane 'print $F[0], "\t", "=" x ($F[1]/5)'

Sparklines

The third method for generating histograms and charts in a terminal that I wanted to mention is sparklines. I first discovered them when I learnt about the spark tool. It’s a simple bash script that you can use to generate Tufte’s sparklines.

Here’s an example (earthquakes and their magnitudes in the last 24 hours):

$ curl -s https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_day.csv | \
  sed '1d' | \
  cut -d, -f5 | \
  spark
▃█▅▅█▅▃▃▅█▃▃▁▅▅▃▃▅▁▁▃▃▃▃▃▅▃█▅▁▃▅▃█▃▁
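On a host where spark is not installed, the idea behind it can be approximated with a few lines of awk. This is a rough sketch, not the spark script itself: mini_spark is a hypothetical name, and its mapping of values onto the eight block characters is a plain linear scale between the minimum and maximum, which may round differently from spark.

```shell
# mini_spark: read one number per line and print a one-line sparkline.
mini_spark() {
  awk '
    NF { v[++n] = $1 + 0                       # keep numeric values, skip empty lines
         if (n == 1 || v[n] < min) min = v[n]
         if (n == 1 || v[n] > max) max = v[n] }
    END {
      levels = split("▁ ▂ ▃ ▄ ▅ ▆ ▇ █", tick, " ")
      for (i = 1; i <= n; i++) {
        # map [min, max] linearly onto tick[1..levels]; constant input uses the lowest tick
        idx = (max > min) ? int((v[i] - min) * (levels - 1) / (max - min)) + 1 : 1
        printf "%s", tick[idx]
      }
      if (n) print ""
    }'
}

# Usage sketch:
#   cut -d, -f5 quakes.csv | sed '1d' | mini_spark
```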

October 16, 2020

Setting up autossh autostart with systemd

§ tech howto     # autossh systemd

Just a quick note on setting up autossh at system startup. I use it to proxy-forward traffic from an internet-exposed host to a firewalled host inside a private network. This way all the data and apps stay on-prem but are available to external users if needed.

The advantage of autossh is that it restarts ssh if the connection breaks for some reason. It’s important to configure it so that it can detect such breakdowns. For non-critical services, I specify the following options:

-o "ExitOnForwardFailure=yes" -o "ServerAliveInterval 30" \
-o "ServerAliveCountMax 3"

That makes autossh detect issues within 2 minutes (three missed keepalives sent 30 seconds apart, so about 90 seconds), which is enough for my purposes. The rest of the parameters I provide disable the autossh monitoring mechanism (-M 0, because it’s not very reliable), send it to the background (-f, if running from the command line), and set up the ssh tunnel in the standard way. Here’s an example:

autossh -M 0 -f -o "ExitOnForwardFailure=yes" \
                -o "ServerAliveInterval 30" \
                -o "ServerAliveCountMax 3" \
                -NR 8088:127.0.0.1:80 -i <ssh_key> user@host

To have this command executed at system boot, we need to create a simple systemd service file /etc/systemd/system/autossh-<host>-<service/port>.service:

[Unit]
Description=Keeps a tunnel to <host> for <service/port> open
After=network.target

[Service]
User=<user>
ExecStart=/usr/bin/autossh -M 0 -o "ExitOnForwardFailure=yes" \
                                -o "ServerAliveInterval 30" \
                                -o "ServerAliveCountMax 3" \
                                -NR 8088:127.0.0.1:80 \
                                -i <ssh_key> \
                                user@host

[Install]
WantedBy=multi-user.target

and enable it so that it starts on boot: systemctl enable autossh-<host>-<service/port> (add --now to also start it right away).
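If several tunnels are needed, the unit file can be generated from a few parameters instead of being edited by hand. This is a sketch under assumptions: render_autossh_unit and its argument order are hypothetical, and the autossh options are simply the ones used above, collapsed onto a single ExecStart line.

```shell
# render_autossh_unit LOCAL_USER REMOTE REMOTE_PORT LOCAL_PORT KEY_FILE
# Print a systemd unit for a reverse tunnel to stdout (hypothetical helper).
render_autossh_unit() (
  local_user=$1 remote=$2 remote_port=$3 local_port=$4 key_file=$5
  cat <<EOF
[Unit]
Description=Keeps a tunnel to ${remote} for port ${local_port} open
After=network.target

[Service]
User=${local_user}
ExecStart=/usr/bin/autossh -M 0 -o "ExitOnForwardFailure=yes" -o "ServerAliveInterval 30" -o "ServerAliveCountMax 3" -NR ${remote_port}:127.0.0.1:${local_port} -i ${key_file} ${remote}

[Install]
WantedBy=multi-user.target
EOF
)

# Usage sketch (run as root; the file name follows the pattern used in this post):
#   render_autossh_unit tunnel user@host 8088 80 /home/tunnel/.ssh/id_ed25519 \
#     > /etc/systemd/system/autossh-host-80.service
#   systemctl daemon-reload && systemctl enable --now autossh-host-80
```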


October 5, 2020

Fixing mosh failing on MacOS

§ tech     # macos mosh locale terminal iterm console

TLDR

Problem: when trying to connect to a remote server with mosh, you see error messages like:

mosh-client needs a UTF-8 native locale to run.
mosh-server needs a UTF-8 native locale to run.
The locale requested by LC_CTYPE=UTF-8 isn't available here.
locale: Cannot set LC_CTYPE to default locale: No such file or directory

OR when connecting with ssh from iTerm2 you get the error message

-bash: warning: setlocale: LC_CTYPE: cannot change locale (UTF-8): No such file or directory

Solution: If you are not in US/GB/CA/IE/AU/NZ, add the following line to your .profile and restart the shell.

export LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8

I am quite a heavy user of mosh – a mobile shell that makes it really convenient to work via ssh over an unstable Internet connection or when roaming across different wifi networks. It’s one of the first tools I install on a new server, and from then on I switch from ssh to mosh.

This weekend I was doing the initial configuration of a new server I had provisioned at OVH hosting. To my surprise, mosh failed to connect with the following error message:

$ mosh myserver.at.ovh

mosh-client needs a UTF-8 native locale to run.

Unfortunately, the client's environment ([no charset variables]) specifies
the character set "US-ASCII".

LANG=
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=
$ 

This was quite a surprise, as surely the terminal in MacOS has a UTF-8 native locale. I changed to iTerm2 and tried connecting again, and again ended up with a similar-looking error:

$ mosh myserver.at.ovh
/etc/profile.d/lang.sh: line 19: warning: setlocale: LC_CTYPE: cannot change locale (UTF-8): No such file or directory
The locale requested by LC_CTYPE=UTF-8 isn't available here.
Running `locale-gen UTF-8' may be necessary.

The locale requested by LC_CTYPE=UTF-8 isn't available here.
Running `locale-gen UTF-8' may be necessary.

mosh-server needs a UTF-8 native locale to run.

Unfortunately, the local environment (LC_CTYPE=UTF-8) specifies
the character set "US-ASCII",

The client-supplied environment (LC_CTYPE=UTF-8) specifies
the character set "US-ASCII".

locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
LANG=
LC_CTYPE=UTF-8
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=
Connection to myserver.at.ovh closed.
/usr/local/bin/mosh: Did not find mosh server startup message. (Have you installed mosh on your server?)
$

Apparently, something was wrong with my locale. Then I realized I had recently done a complete clean re-install and never restored whatever locale settings I had in my .bash_profile.

This actually only increased my curiosity: a freshly installed MacOS with an almost default configuration was causing mosh to fail on connection because of locale settings. I googled the mosh error message a bit but didn’t find anything illuminating; then I noticed that when I connected to the server with ssh, iTerm2 would emit the following error message:

$ ssh myserver.at.ovh
Last login: Mon Oct  5 11:00:30 2020 from 178.136.74.184
-bash: warning: setlocale: LC_CTYPE: cannot change locale (UTF-8): No such file or directory

Interestingly, this message didn’t show up when I ran ssh from Terminal. So I googled this message in connection with iTerm2 and found issue #5478, with a comment showing an excerpt from the iTerm2 debug log:

getLocale: languageCode=en, countryCode=ZA
Tentative locale is en_ZA.UTF-8
Locale is NOT supported
Set LC_CTYPE=UTF-8

Now this was illuminating! iTerm2 calculates the value of the LC_CTYPE variable by getting the language and country codes from MacOS. When it ends up with a combination for which MacOS lacks a locale definition, it falls back to the default LC_CTYPE=UTF-8. Terminal just defaults to LC_CTYPE=C, which is non-UTF-8!

Let’s see which English locales MacOS does support:

$ cd /usr/share/locale/
$ ls -1d en_* | column -c 80
en_AU                   en_CA.UTF-8             en_NZ.ISO8859-1
en_AU.ISO8859-1         en_GB                   en_NZ.ISO8859-15
en_AU.ISO8859-15        en_GB.ISO8859-1         en_NZ.US-ASCII
en_AU.US-ASCII          en_GB.ISO8859-15        en_NZ.UTF-8
en_AU.UTF-8             en_GB.US-ASCII          en_US
en_CA                   en_GB.UTF-8             en_US.ISO8859-1
en_CA.ISO8859-1         en_IE                   en_US.ISO8859-15
en_CA.ISO8859-15        en_IE.UTF-8             en_US.US-ASCII
en_CA.US-ASCII          en_NZ                   en_US.UTF-8

Aha, not that many countries are supported, just the Five Eyes and Ireland! That explains why that person from South Africa ran into this issue with en_ZA and iTerm2: every combination of English as the OS language with an odd country will result in it. In my situation, the country was Ukraine with the country code UA, and of course there is no en_UA locale.
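The fallback seen in the debug log can be mimicked with a few lines of shell. This only illustrates the behaviour described above and is not iTerm2’s actual code: guess_lc_ctype and its optional locale directory argument are made up for the example, and "supported" is assumed to mean that a directory with the locale’s name exists.

```shell
# guess_lc_ctype LANG_CODE COUNTRY_CODE [LOCALE_DIR]
# Print <lang>_<country>.UTF-8 if such a locale directory exists,
# otherwise fall back to the bare "UTF-8" value.
guess_lc_ctype() (
  lang=$1 country=$2 locale_dir=${3:-/usr/share/locale}
  tentative="${lang}_${country}.UTF-8"
  if [ -d "$locale_dir/$tentative" ]; then
    echo "$tentative"
  else
    echo "UTF-8"
  fi
)

# Usage sketch:
#   guess_lc_ctype en US
#   guess_lc_ctype en UA
```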

How does this affect the remote side of the mosh connection, the server I am connecting to, though? Well, ssh (which mosh uses for the authentication phase) by default forwards the LANG and LC_* environment variables to the remote server it is connecting to:

$ grep -n -B1 SendEnv /etc/ssh/ssh_config
48-Host *
49:     SendEnv LANG LC_*

This makes the remote side of the connection declare them, and then mosh-server refuses to start because it requires a UTF-8 native locale to run. When I was connecting from Terminal, the locale was set to C which is not UTF-8, as I mentioned above; when I was connecting from iTerm2, the locale was defaulted to UTF-8 and this also irked mosh-server because there was no definition for it in the system.


How to fix it?

I could generate the en_UA.UTF-8 locale definition locally, and that would make terminals define the LC_* variables correctly, but they would still be sent over to the servers I am connecting to, and there mosh-server (and probably some other commands) would fail. This is no good.

Generating en_UA.UTF-8 locale definitions on all the servers I am connecting to is out of the question too, just like overriding the LANG/LC_* variables in .bash_profile on all the servers: this might break legitimate clients with locales other than en_UA.UTF-8.

The first working solution would be to disable LANG/LC_* variable forwarding in the ssh configuration. Unfortunately, it is set in the system-wide /etc/ssh/ssh_config and cannot be overridden in the per-user ~/.ssh/config. I hate changing system-wide default settings with no good reason, and the locale problem didn’t seem to me a good enough reason to warrant such a change.

Therefore I have opted for another option: just override the LANG/LC_* variables in my user’s .bash_profile. This is a more kludgy option, but in my situation it looks like the most reasonable thing to do. I have just added the following line to my ~/.bash_profile and it resolved the issue completely:

export LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8
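A slightly more defensive variant, offered as an assumption rather than as what this post uses: override the variables only when the terminal handed over an empty, C/POSIX or bare UTF-8 value for LC_CTYPE, so that a valid locale set elsewhere is left alone. The function name fix_lc_ctype is hypothetical.

```shell
# Intended for ~/.bash_profile: replace only the LC_CTYPE values that upset mosh.
fix_lc_ctype() {
  case "${LC_CTYPE:-}" in
    ""|C|POSIX|UTF-8)
      export LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8
      ;;
  esac
}
fix_lc_ctype
```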

September 22, 2020

Latency numbers every engineer should know

§ tech     # latency numbers

I have been re-reading The Practice of Cloud System Administration by Tom Limoncelli et al. (which is great and well worth reading even 5 years after it was published), and it has this wonderful reference table called Latency Numbers Every Engineer/Programmer Should Know. This table was popularized by Jeff Dean and originally presented by Peter Norvig.

I find it handy and wanted to copy it here, on my blog, but then realized it’s almost a decade old as the data was from 2012. This made me do a little research (read: googling) to see how the decade affected the numbers listed. The answer: not much! The only noticeable changes are in disk and network performance thanks to better SSDs/NVMe and 10/100Gb networks.

While googling, though, I discovered two interesting bits: some nice representations of the data, and a similar scaled system latency table from Brendan Gregg’s Systems Performance book. Besides the nice scaling of the latency numbers, that table neatly arranges different subsystems by speed: CPU, memory, disk I/O, networking, and reboot times.

I have added a scaling column to the original latency table, updated the numbers a bit in both tables to reflect the current situation, and am putting them below for reference.

Latency Numbers Every Engineer Should Know

3GHz CPU cycle                     0.3 |   1 s    |
L1 cache reference                 0.5 |   2 s    |    
Branch mispredict                  3   |  10 s    |
L2 cache reference                 5   |  20 s    | ~10x L1 cache
Mutex lock/unlock                ~20   |   1 min  |
Main memory reference            100   |   5 min  | 20x L2 cache, 200x L1 cache
Compress 1K with Zippy         2,000   |   2 h    |
Send 1K over 1Gbps network    10,000   |  10 h    | 10Gbps is 10x faster, duh
Read 4K randomly from SSD    150,000   |   6 days | ~1GB/sec SSD
Read 1 MB seq from memory    250,000   |  10 days |
Round trip within same DC    500,000   |  20 days |
Read 1 MB seq from SSD*    1,000,000   |   1 mo   | ~1GB/sec SSD, 4x memory
Disk seek                 10,000,000   |  12 mo   | 20x DC roundtrip
Read 1 MB seq from disk   20,000,000   |   2 yrs  | 80x memory, 20x SSD
Send pkt CA->NL->CA      150,000,000   |  12 yrs  |
                        |   |   | ns|  | ~scaled  |
                        |   | us|
                        | ms|

I have updated disk/net numbers with data from this handy site which provides reference historical performance data over the last few decades.

Example Time Scale of System Latencies

3GHz CPU cycle                  0.3 ns  |   1 s    |
L1 cache access                 0.5 ns  |   2 s    |    
L2 cache access                 2.8 ns  |   9 s    |    
L3 cache access                12.9 ns  |  43 s    |    
Main memory access            120   ns  |   6 min  | 
Solid-state disk IO       150,000   ns  |   6 days |
Rotational disk IO             10   ms  |  12 mo   |
Internet: SF to NY             40   ms  |   4 yrs  |
Internet: SF to UK             80   ms  |   8 yrs  |
Internet: SF to AU            185   ms  |  19 yrs  |
TCP packet retransmit           3   s   | 317 yrs  |
Container OS reboot             4   s   | 423 yrs  |
SCSI command time-out          30   s   |  3k yrs  |
VM reboot                      40   s   |  4k yrs  |
Physical system reboot          5   min | 32k yrs  |
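The scaled column in both tables stretches one 3GHz CPU cycle (0.3 ns) to one second, so every latency is divided by 0.3 ns and then expressed in the largest convenient unit. A small sketch of that arithmetic follows. The function name scale_latency is hypothetical, a month is assumed to be 30 days and a year 365 days, and because the tables round quite loosely the output will not match every row exactly.

```shell
# scale_latency NANOSECONDS: print the latency on the "1 CPU cycle = 1 s" scale.
scale_latency() {
  awk -v ns="$1" 'BEGIN {
    s = ns / 0.3                                   # scaled seconds: 0.3 ns -> 1 s
    if      (s < 60)       printf "%.0f s\n",    s
    else if (s < 3600)     printf "%.0f min\n",  s / 60
    else if (s < 86400)    printf "%.0f h\n",    s / 3600
    else if (s < 2592000)  printf "%.0f days\n", s / 86400      # under 30 days
    else if (s < 31536000) printf "%.0f mo\n",   s / 2592000    # under 365 days
    else                   printf "%.0f yrs\n",  s / 31536000
  }'
}

# Usage sketch:
#   scale_latency 150000        # SSD random read, in ns
#   scale_latency 3000000000    # TCP packet retransmit, 3 s in ns
```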

— `If you knew Time as well as I do,' said the Hatter, `you wouldn't talk about wasting IT. It's HIM.'
$ Last updated: Jan 17, 2021 at 20:57 (EET) $