Sunday, May 26, 2013

Don't Fear the Penguin!

So, you're interested in Linux. Awesome! There's a version of Netflix for Ubuntu now, and Flash is working pretty darn well (inasmuch as Flash works at all). There's support for most of the top-end graphics cards, and games work well under Linux now.

That pretty much eliminates the final few hurdles that a Windows person could tout as something "holding them back" from moving to Linux.

I expect we'll see a lot more people embracing linux over the next few years.

Let's explore the linux landscape.

If you're coming from Windows, you're moving from a (pretty-much) "single-user system" to a full multi-user system.

Windows doesn't have the idea of multiple people being logged into the windowing environment at the same time, each getting desktops and being able to run programs. At the same time.

Linux can do that - you can set it up so it'll run Xwindow programs which display remotely on other computers. (more on that below) And, multiple different (or the same) accounts can be logged in via different devices - terminals, virtual terminals, via ssh "network connections", etc.

Linux has the idea of "regular" users and a "superuser" - root. So you'll want to have a regular user that you use, and then escalate to root privilege only when something you're trying to do won't work without that extra privilege. That helps ensure (mostly) that whatever you do as a regular user won't break the computer.

You'll also want to have a secondary backup userID you can log in as, for testing.

It's really handy, if you're having problems with a particular program, to be able to log in as another user to see if that same problem exists for a "clean" user that hasn't been doing all of the other stuff your main user has been doing.

Next - let's talk about devices - drives, CDs, USB sticks, etc.

Linux doesn't expose devices like Windows does. You don't say C:/where/ever/

Instead, linux just has a "file system" directory tree, which starts up at / and works its way down, with directories inside directories.

The devices get mounted at "mount points" - effectively replacing the existing directory in the parent filesystem with the "/" directory of the mounted device.

An explanation is in order.

Say my main drive has:

/
|-- bin
|-- boot
|-- etc
|-- home
|-- lib
|-- lost+found
|-- media
|-- mnt
|   `-- drive
|-- opt
|-- proc
|-- root
|-- sbin
|-- selinux
|-- srv
|-- sys
|-- tmp
|-- usr
`-- var

(yes the above is majorly simplified)

I might have a "data" drive with:

/
|-- photos
|-- music
`-- videos

I could set up a line in /etc/fstab to mount that data drive, or do it interactively:

mount /dev/sdb1 /mnt/drive

After I did that, the drive's contents would show up inside the "mount point":

/
|-- bin
[...]
|-- media
|-- mnt
|   `-- drive
|       |-- photos
|       |-- music
|       `-- videos
|-- opt
[...]
`-- var

Whether it's a secondary drive, or a CD, or a USB stick, or even a remote system's drive (a network mount) - to get to the files on that device, you'll go through a mount point.
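To check which device ended up at which mount point, you can read the kernel's own list. What follows is a minimal sketch, assuming a Linux system that exposes /proc/mounts; the function name list_mounts is made up for illustration.

```shell
# list_mounts: print "device on mountpoint (fstype)" for each mounted filesystem.
# Reads /proc/mounts by default; pass another file in the same format to experiment.
list_mounts() {
    awk '{ printf "%s on %s (%s)\n", $1, $2, $3 }' "${1:-/proc/mounts}"
}

# Typical use: show only the data drive from the example
# (prints nothing if it isn't mounted):
#   list_mounts | grep ' /mnt/drive '
```

For a permanent mount, the /etc/fstab line would look something like "/dev/sdb1 /mnt/drive ext4 defaults 0 2" - the ext4 there is purely an assumption, so use whatever filesystem your drive really has.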

It's really a lot simpler in terms of how it exposes the files to end-users - it's simply always going to be /some/path/to/the/files/you/need/file1.txt

Here's a lot more detail on the Linux Filesystem: Linux Filesystem Hierarchy

Next - Run levels (click the link to the left to learn about them)

Just because your system's booted doesn't mean it'll be graphical. Review the runlevels and learn what they mean. There aren't many, and it's immensely useful to know how to get around problems if/when they happen, by switching to a different runlevel.

You don't need to memorize LFH or run level info - but it's totally worthwhile to understand what they are and remember they're there for when you need them.
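As a quick reminder of what the numbers mean, here's a small sketch. It assumes a SysV-style init and uses the Red Hat convention; Debian and Ubuntu treat levels 2 through 5 alike, so check your own distro. The function name describe_runlevel is made up for illustration.

```shell
# describe_runlevel: explain a SysV runlevel number (Red Hat convention).
describe_runlevel() {
    case "$1" in
        0) echo "halt" ;;
        1) echo "single-user mode" ;;
        2) echo "multi-user, no network file systems" ;;
        3) echo "full multi-user, text console" ;;
        4) echo "unused" ;;
        5) echo "full multi-user with graphical login" ;;
        6) echo "reboot" ;;
        *) echo "unknown runlevel: $1" ;;
    esac
}

# 'runlevel' prints the previous and current level, e.g. "N 5":
#   describe_runlevel "$(runlevel | awk '{print $2}')"
# As root, 'telinit 3' drops to text mode and 'telinit 5' goes back to graphical.
```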

There's more of course, but with the above, you have the basics for this new system that's in front of you. YES - it has a lot of switches and dials and knobs and buttons.

But - of course, you'll want to jump into using all the cool open source software that Linux has to offer, right?

Most of the stable open source stuff for Linux is available via "packages".

With a package manager such as 'yum' or 'synaptic' installing new packages on Linux is a breeze. It wasn't always easy... but it's now practically painless.

Your package manager may have a means of browsing the packages by category, and may show you descriptions, etc. - for example, the "Ubuntu Software Center" provides an outstanding interface to let you explore what it has to offer.

For starters, you'll want to stick with programs you can get "in channel" - via the package manager on your version of Linux. After you've mastered Linux, you can pull in open source software from almost anywhere, with enough patience, research, and willingness to experiment.
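If you prefer the commandline to a GUI package browser, the same "in channel" installs are one command away. Here's a rough sketch that figures out which package tool your system has; the function name detect_pkg_manager is made up, and it only knows about apt-get and yum.

```shell
# detect_pkg_manager: print the first package manager found on this system.
detect_pkg_manager() {
    for tool in apt-get yum; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    echo "none"
    return 1
}

# Typical use (as root); "git" is just an example package name:
#   apt-get install git     # Debian, Ubuntu
#   yum install git         # Red Hat, CentOS, Fedora
```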

Check out the categories of programs that are available via your package manager. The Ubuntu Software Center has:

Accessories
Education
Fonts
Games
Graphics
Internet
Office
Science & Engineering
Sound & Video
Themes & Tweaks
Universal Access
Developer Tools
System

It also has a dozen odd "featured applications" - with a lot of cool stuff you might not have even thought about.

Let's look at browsers.

Chrome's still Chrome! In fact, if you have chrome hooked into a google account, it'll move your bookmarks and preferences and even your theme across from Windows to Linux. Your mileage may vary, but this worked seamlessly for me, and I wasn't even trying to make it work. It just did.

I can highly recommend that as a mechanism for transitioning onto linux.

Firefox... can get slow with lots of add-ins. So, keep it lean. I keep firebug turned off unless I absolutely need it. Most of the time I can figure out problems with chrome's F12 debugger, then once those are fixed everything just works right in Firefox.

You may be able to use a remote-bookmark-sync add-in to move your windows firefox bookmarks to linux - I've not tried that though.

There are other open source browsers. They're great. Use them to test your stuff. But, unless you're being really experimental... stick with Chrome as your primary browser. Or, firefox if you must.

Your browser runs in a window. Please forgive this horrible segue into...

The next subject - windowing.

Linux runs Xwindow. That's a big, complicated windowing system.

It's capable of being "client/server" - you can really EASILY display windows on one computer with the program that's behind those windows running on another computer. And with the networking set up correctly, you can even do that across any virtual machines you might run under Linux as well.

With that immense flexibility comes complication.

There are at least three differing ways you can be "logged in" on linux.

A system-level login is done by "getty" which handles terminal logins and virtual screen logins - the ones you get when you hit ctrl-alt-F1 through ctrl-alt-F8 or so.

Remote logins are handled via the "ssh" service - by default listening on port 22

The Xwindow system will present you with some graphical way of "logging in" if it's running.

When you see a graphical login screen please don't think that's linux. It's not - it's a display manager, most often GDM. Do a google image search for "gnome display manager" and you'll see what I mean - this program has been adapted 10 ways from Sunday for various distros, but it's basically about getting someone "logged in" on Xwindow.

Being "logged in" under Xwindow pretty much means... you have a Window Manager running. At a more detailed level, you have an "X window session" and one of the X clients is your window manager. But, most of the time, that means the same thing as "you have a window manager running".

There are a handful of good desktop environments, each of which comes with its own window manager. Gnome, KDE, xfce, and others. Google "gnome vs kde vs xfce" and read up on them.

For starters... just use whatever window manager your Linux already has.

The Linux command line

Get used to the idea of logging in on a virtual terminal. You can hotkey out of the windowing environment, do some work on a commandline, then hotkey back into X. ctrl-alt-F1, use the shell, ctrl-alt-F7 (probably) and you're back in X.

Also get used to using ssh. You can ssh into your linux computer from elsewhere. You can ssh into any virtual machines you might run (under VirtualBox or VMware or whatever).

AND... you can (a) tell X to allow remote windowing clients to connect from machineA, (b) ssh to machineA, (c) set your DISPLAY environment to point back at your current machine, (d) run X-window programs on that remote machine, which will display on your local machine. This is HUGELY useful.
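Here's roughly what those four steps look like when typed out. Treat it as a sketch: "mydesktop" and "machineA" are placeholder hostnames, display_for is a made-up helper, and it assumes your X server accepts TCP connections on display 0.

```shell
# display_for: build a DISPLAY value pointing at the first X display on a host.
display_for() {
    printf '%s:0\n' "$1"
}

# The four steps:
#   xhost +machineA                            # (a) on mydesktop: allow clients from machineA
#   ssh machineA                               # (b) log in to the remote machine
#   export DISPLAY="$(display_for mydesktop)"  # (c) on machineA: point X clients back home
#   xterm &                                    # (d) runs on machineA, displays on mydesktop
```

Many distros start X with TCP listening turned off. If that's your situation, "ssh -X machineA" takes care of steps (a) through (c) for you by tunneling X over the ssh connection.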

There are a bunch of Xwindow programs that will emulate terminals for you.

List of terminal emulators

There's "terminal" from the window manager's menu system - that's typically actually gnome-terminal. There's a KDE terminal program (Konsole), there's xterm, and there are others as well - yakuake being my absolute favorite.

You'll definitely want to get accustomed to using the commandline for file manipulation, for interacting with version control systems like git, etc. - and possibly for file editing as well.

I totally prefer editing a text file in vim over using a "notepad" or "word"-ish word processor GUI.

Here's an excellent, highly-recommended read regarding "TTYs" and their role in UNIX/Linux - understand this and you're well on your way to understanding linux: The TTY demystified

Owning the command line instead of letting it own you

By now, your coffee's cold, and you're overwhelmed... but don't be! Linux tries to be pretty open and friendly, believe it or not.

At a commandline, type in "man man" to see the manual page on how the manual pages work. And, "man apropos" to learn about a REALLY awesome program for searching through the manual pages.

Apropos is way cool and extremely useful. It's google for manpages. It might take a few tries, but you can find what you're looking for.

Example: ...what was the name of that flight simulator I installed recently?

$ apropos flight
...hrm... no output...
$ apropos simulator
sabresdl (6)         - SVGAlib fighter plane simulator
XRunSabre (6)        - SVGAlib fighter plane simulator
xsabre (6)           - SVGAlib fighter plane simulator

Yup! Sabre. Looks like I need to run XRunSabre to start it.  (Guess what I'll be doing for the next couple of hours?)

There's a hidden little corner in Linux named /usr/share/doc - check it out - there's a tremendous collection of random information in there.

Using Linux from day to day

It's your computer, and you want it to be easy to escalate to root privilege, but don't want to always be root - that's quite unsafe. However, you don't want to have to type in a password anytime you want to escalate to root.

How to do this? Leverage 'sudo'.

First, escalate to root:

su -
(enter root password)

Then, add one line at the end of the file /etc/sudoers to tell sudo your account is allowed to escalate to root. All of the following assumes your username is 'joe' - replace that with your real username.

If you know the basics of the 'vi' editor, run 'visudo' and add an extra line at the bottom of the file:

joe ALL=NOPASSWD: ALL

Or if you want to use the editor "nano" for example,

# which nano
# export EDITOR=/replace/with/path/to/nano
# visudo

...which will let you edit the file in the nano editor (friendlier for beginners)

Lastly, if you're brave you can simply do surgery on the file from the commandline. You'll want to ensure you get the syntax EXACTLY right, or you may find you've broken sudo.

To add a line to the end of the file you can run:

echo "joe ALL=NOPASSWD: ALL" >> /etc/sudoers

BE SURE TO PUT IN TWO > SYMBOLS!   With >> you're adding one line to the END of that file. With only one > you would be replacing the entire file with only that one line - a really bad idea.
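If you'd like a little safety net for that surgery, here's a sketch that wraps the append in a function so you can rehearse on a scratch copy first. The function name add_nopasswd_rule is made up; the rule it writes is the same one shown above.

```shell
# add_nopasswd_rule: append a passwordless-sudo rule for a user to a sudoers file.
# The file defaults to /etc/sudoers; pass a scratch copy as the second argument to rehearse.
add_nopasswd_rule() {
    user=$1
    file=${2:-/etc/sudoers}
    # '>>' appends; a single '>' would wipe out the rest of the file
    printf '%s ALL=NOPASSWD: ALL\n' "$user" >> "$file"
}

# As root:
#   add_nopasswd_rule joe
#   visudo -c        # check-only mode: reports whether the file still parses
```

Run the visudo check before you exit the root shell - if it complains, you're still root and can fix the file.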

For more info on sudo and sudoers use 'man':

man sudoers
man sudo

So, after you've edited the /etc/sudoers file, exit the root shell. You're back to being logged in as 'joe' - but now you can run stuff as root super easy by just typing 'sudo ' followed by whatever commandline you want to run.

Here's an example. At a certain point, you're bound to be interested in "what's running" on the system. Not just as 'joe' - as any user.

You can use the 'pstree' program to see all of the programs. 'man pstree' for details, but for some unfathomable reason I like to use the following options:

sudo pstree -paul

Another example. You want to see what 'services' are running. Services listen on TCP/IP ports and let remote clients connect to those ports. 'man netstat' for details, but this is what you'll want to run:

sudo netstat -ntlp

What IS all of this stuff that's running?

Feel free to ask me or others about various processes and services - google them and learn. A lot of these things can be turned off if they're not needed.

Final notes

Lastly let's discuss synergy and dropbox. Both of these tools can help you with transitioning to a multi-OS solution.

Synergy lets you share your keyboard/mouse across multiple computers - linux/windows/mac - and also copy/paste across them.

Here's the link to synergy: Synergy (and please do support them if you can)

Dropbox is... well... dropbox. Dropbox - Simplify your life

It's a network drive to drop files on. It works well on linux and windows and mac as well, so you can use it to move files back and forth, or just leave them out on dropbox and then you can get to them from anywhere.

If you get stuck on something with Linux, and Google doesn't help, don't hesitate to contact me - I'll be happy to help as best I can.

American Robot Merlin Robots

Overhead-mounted Merlin with forearm extension, welding a car frame

American Robot is gone. It's not a viable company anymore. That's a shame - it was a real contender.

Merlin robots were ahead of their time. They had sub-millimeter accuracy, could position a heavy end-effector like a welder, parts dispenser, or cutting tool very precisely, and they had some pretty advanced programming capabilities.

The sands of time are unforgiving... there's very little data about these robots, or the company, available anymore. So, I'm happy to provide above what I think is one of the very best available pictures of a Merlin robot in action.

Debugging Apache

Most people get overwhelmed at the idea of debugging apache from the commandline, since there are multiple child processes involved, and each one handles who-knows-how-many requests.

But... it can be done!

First, from your local machine, figure out what the server will see as your IP address. http://ipv4.icanhazip.com/

On the server, in the shell you'll be using to strace apache, enter:
export MYIP=1.2.3.4
...replacing 1.2.3.4 with the output you got from icanhazip. That'll set the MYIP environment variable to your remote IP address.

From your local computer, at a commandline, telnet to the server, port 80. Type in:
GET / HTTP/1.1
then hit enter. Then type in:
Host: whateverdomain-dot-com
...replacing that with your actual domainname of course. And hit enter. ONCE.

Then, on the server, run the following:
strace -s 999 -p `netstat -ntp|grep $MYIP|grep httpd|awk '{print $7}'|sed 's/\/.*//'`
That'll start strace up. It should show that apache child waiting for input.

Back on your local machine, hit one more enter. That ends the "GET" request.

On the server, you should see the debug strace of apache as it processes that GET request for you.

You'll have to control-C it when it's clear it's now done with your request and gone on to handling some other one.... but the beginning of that strace output will show you exactly what apache did to handle your request.
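If the server is busy, the grep pipeline can match more than one line, or match your IP in the wrong column. Here's a slightly pickier sketch of the PID lookup. It assumes the usual "netstat -ntp" column layout and a process named httpd; the function name httpd_pid_for_ip is made up.

```shell
# httpd_pid_for_ip: read 'netstat -ntp' output on stdin and print the PID of the
# first httpd process whose remote (foreign) address is the given IP.
httpd_pid_for_ip() {
    awk -v ip="$1" '
        {
            addr = $5
            sub(/:[0-9]+$/, "", addr)    # drop the port
            sub(/^::ffff:/, "", addr)    # drop the IPv4-mapped IPv6 prefix, if any
        }
        addr == ip && $7 ~ /\/httpd/ {
            split($7, parts, "/")        # $7 looks like "4321/httpd"
            print parts[1]
            exit
        }
    '
}

# On the server (as root):
#   strace -s 999 -p "$(netstat -ntp | httpd_pid_for_ip "$MYIP")"
```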

How many programs are running in which directories?

Every program has a "current working directory".

The following will list all of the current "current working directories" and how many processes are in each one.

For Linux:
readlink /proc/*/cwd|sort|uniq -c
For all Unixes:
lsof|grep cwd|awk '{print $9}'|sort|uniq -c

How much is bad code contributing to mysql connection exhaustion?

Sleeping connections happen when the client code exits without closing the mysql connection.

That happens a lot more often than you might think - developers are happy to open database connections, but think they'll get closed "automatically" when the page is done being rendered.

In apache that's not true - the apache process keeps running and handling other requests. So the connection stays open, wasted, until the server finally closes it after a timeout, or until the apache child process gets recycled (having handled MaxRequestsPerChild requests - often in the thousands)

Here's a quick way to review what the current ratio of sleeping to total connections is for MySQL.

mysqladmin processlist|sed -n '4,$p'|grep -v ^+|awk -F\| '{c++} index($6,"Sleep") > 0 {s++} END {print s+0 " sleeping connections out of " c}'
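Here's a variant of the same count that reads the processlist table on stdin, so you can try it against a saved copy of the output. It's a sketch that assumes the usual mysqladmin table layout, where the Command column is the sixth field when you split on "|"; the function name count_sleeping is made up.

```shell
# count_sleeping: read a 'mysqladmin processlist' table on stdin and report
# how many of the listed connections are in the Sleep state.
count_sleeping() {
    awk -F'|' '
        # data rows start with "|"; the header row has "Id" in its first column
        substr($0, 1, 1) == "|" && $2 !~ /^ *Id *$/ {
            total++
            if (index($6, "Sleep") > 0) sleeping++
        }
        END { printf "%d sleeping connections out of %d\n", sleeping, total }
    '
}

# On the server:
#   mysqladmin processlist | count_sleeping
```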

Lowering TTLs on DNS records managed by Plesk

I heartily do NOT recommend using single-server DNS solutions such as Plesk's DNS.

That said... sometimes when it's already in use, you might want to lower all of the TTLs on the records, in batch. ...maybe as the first step in helping migrate all of those domains over to another DNS? :-)

You can set the TTL value to any value using this command:

mysql -uadmin -p`cat /etc/psa/.psa.shadow` psa -e "update misc set val = '300' where param = 'SOA_TTL';"

You will then need to run this command to force Plesk to reread the zone files:

mysql -Ns -uadmin -p`cat /etc/psa/.psa.shadow` -D psa -e 'select name from domains' | awk '{print "/usr/local/psa/admin/sbin/dnsmng update " $1 }' | sh

The above shows a cool technique - use awk to construct commandlines that include output from a mysql select... then feed the output to shell.

Kill all MySQL transactions

This is an emergency measure. I don't recommend you use it in production except when there's no other choice.

for i in `mysql -Ne "show processlist" | awk '{print $1}'`; do mysqladmin kill $i; done

Pony up some URLs, Apache!

Apache rocks. But it's certainly not end-user friendly from the commandline.

Check the output of "httpd -S" if you don't believe me.

This helps - it outputs actual URLs. Depending on your terminal emulator, you may be able to click them directly to get them to load in a browser.

httpd -S 2>&1|awk '/port/ {print $4}'|sed 's/^/http:\/\//'

What to do when Apache lies to you, saying "No space left on device"

If you ever find apache logs complaining "No space left on device" when, in reality, there's a LOT of free space on all of your drives... possibly apache has exhausted the pool of IPC semaphores.

Restarting apache won't help - the semaphores are external to the processes. You don't have to resort to rebooting the server - the following should resolve the matter.

# service httpd stop
# for i in `ipcs -s|awk '$3 == "apache" {print $2}'`; do ipcrm -s $i ; done
# service httpd start
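Before clearing anything, it can help to confirm that semaphores really are the problem. Here's a small sketch that counts the semaphore arrays a user owns, assuming the usual Linux "ipcs -s" layout with the owner in the third column. The function name count_semaphores is made up.

```shell
# count_semaphores: read 'ipcs -s' output on stdin and count the semaphore
# arrays owned by the given user (default: apache).
count_semaphores() {
    awk -v owner="${1:-apache}" '$3 == owner { n++ } END { print n + 0 }'
}

# On the server:
#   ipcs -s | count_semaphores apache
#   ipcs -ls        # shows the system-wide semaphore limits for comparison
```

If the count is at or near the limit on the number of arrays, you've found your "No space left on device".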

The UNIX Swiss Army Knife

I'm a huge fan of awk's associative array handling. Here's an example that leverages this feature - summarizing one output field against another.

ps aux | awk \
'NR>1 {s[$11]=s[$11]+$4} END {for (i in s) {print s[i] " " i}}' \
|sort -n|grep -v '^0 '

This takes the output of 'ps aux' and sums the percentage memory used ($4) for each processname ($11). The processname is used as the index into an associative array named 's'. The array iteration within the END clause (and thus, the output) is in no particular order, so sorting the output is helpful. There are other approaches - see the link at the end of this post for one alternative. The grep at the very end of the command pipeline omits processes whose memory total comes out as 0.

The end result will look something like this:

2.8 bash
18.3 mysqld
70.5 httpd

To sum the CPU being used instead of memory, just use $3 instead of $4.

To summarize by userID instead of by what program... just use $1 instead of $11 (in both places it's mentioned, of course).

The same technique can be used on logfiles - for example, for most common apache access_log formats, you can quickly sum how many bytes have been transferred to specific IP addresses, or figure out which IPs have been transferring the same page over and over.

(The trick for figuring out which IPs are getting the same pages over and over is to catenate the IP and the pagename into a single string, use THAT as the index into the array, and simply increment a counter at that index.)
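Here's a sketch of both logfile tricks. It assumes the common apache access_log layout where the client IP is field 1, the requested page is field 7, and the byte count is field 10 - check those against your own log format. The function names are made up.

```shell
# bytes_per_ip: sum the bytes sent to each client IP, smallest total first.
bytes_per_ip() {
    awk '$10 ~ /^[0-9]+$/ { bytes[$1] += $10 }
         END { for (ip in bytes) print bytes[ip], ip }' | sort -n
}

# repeat_hits: use "IP page" as one combined array index and count the hits;
# only combinations seen more than once are printed, busiest first.
repeat_hits() {
    awk '{ hits[$1 " " $7]++ }
         END { for (k in hits) if (hits[k] > 1) print hits[k], k }' | sort -rn
}

# Typical use (the log path varies by distro):
#   bytes_per_ip < /var/log/httpd/access_log
#   repeat_hits  < /var/log/httpd/access_log
```

The byte-count check skips lines where the log shows "-" instead of a number, which is what apache writes when no body was sent.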

The following is FAR from a one-liner - but it does show some of the cool stuff that can be done with awk's associative arrays: https://github.com/PaulReiber/Log-Dissector

Here's another example - a bit simpler - this uses two associative arrays, with the same key, giving us both a counter and a list of entries at a given "index": Paul Reiber's answer to Linux: Which Linux or Windows utility application helps to find duplicated folders?

PBRs blogspot: a lifetime of learning about GNU/linux.