Get the real file or directory size in unix or linux

All BASH users (Linux, Unix, OSX, etc) use the ls command, but when we want to know how much disk space has been used, ls sometimes just doesn’t cut it. It is useful for listing information about the files in a directory, or the directory structure, but it doesn’t give you the overall space that a directory uses, including the files inside of it. For that you need commands that are less commonly known. Here’s a typical output of the ls command:


user@localhost:~/Pictures$ ls -lh

 total 644M
 -rw------- 1 user user 993K Jul 12 15:08 IMAG0142.jpg
 -rw------- 1 user user 790K Jul 12 15:08 IMAG0143.jpg
 -rw------- 1 user user 1.1M Jul 12 15:08 IMAG0144.jpg
 -rw------- 1 user user 1.3M Jul 12 15:08 IMAG0145.jpg
 -rw------- 1 user user 1.1M Jul 12 15:08 IMAG0146.jpg
 -rw------- 1 user user 1.1M Jul 12 15:08 IMAG0147.jpg

Notice how the output shows the sizes of the files. However, if we cd .. and look at the directory itself, it shows this:

drwxr-xr-x 2 user user 24K Jul 21 12:02 Pictures

Although it is clearly stated above while in the Pictures directory that the content takes up 644 Megabytes of space, listing the directory itself only shows that it is 24 Kilobytes. That’s a little misleading, don’t you think?

In order to get around this issue, there is a different command that will do the trick; the du command.

user@localhost:~/Pictures$ du -sh
 686M .

The command as typed above shows the combined size of everything in the present working directory, subdirectories included. However, if you were to add a directory name to the command, you would have this output:

user@localhost:~/$ du -sh Pictures
 644M Pictures

If you were to use a * instead of a directory name, you would retrieve the results of all of the directories in the current directory:

user@localhost:~/$ du -sh *
 644M Pictures
 2.6M Videos
 4.5M Wallpapers
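
When a directory holds a lot of entries, sorting the output makes the big ones easy to spot. Here is a minimal sketch of one way to do that. The name dir_sizes is a hypothetical helper I made up for illustration, and sort -h assumes GNU coreutils:

```shell
# dir_sizes is a hypothetical helper, not a standard command.
# Prints each entry of a directory with its total disk usage, smallest first.
dir_sizes() {
    # Run in a subshell so the caller's working directory is not changed.
    (
        cd "${1:-.}" || exit 1
        # -s: one total per argument, -h: human-readable units
        # sort -h understands the K/M/G suffixes (GNU coreutils)
        du -sh -- * | sort -h
    )
}
```

Calling dir_sizes ~ would list everything in your home directory, with the largest entry on the last line.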

rsync over SSH

Many of us use SSH multiple times a day to do simple, complicated, and often redundant tasks. Often these are tasks that could be scripted and automated. For instance, if you have to synchronize files with a server often throughout the day, a cron job is the ideal way, because then it is done automatically and you don’t have to worry about it. If you use SSH keys without a passphrase to access a server, you can build on that by using rsync to synchronize local and remote directories.

Here is the command you would use to make this happen:

rsync -e 'ssh -i ~/.ssh/id_rsa' -rulvhtpz /Users/user/file_to_sync user@host.com:~/

rsync options used

-r, recursive throughout directories
-u, skip files that are newer at the destination (meaning only update old files)
-l, copy symlinks as symlinks
-v, verbose; show all output as it happens
-h, display output in human readable format
-t, preserve times of files
-p, preserve permissions
-z, compress file data during transfer to save bandwidth
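
Before pointing the command at real data, it can help to preview what it would transfer. The sketch below adds rsync's -n (--dry-run) option to the same set of flags. The name preview_sync is a hypothetical helper, and the key path is simply the one used in this article:

```shell
# preview_sync is a hypothetical helper; the key path matches the article.
# Usage: preview_sync <local_path> <user@host:remote_path>
preview_sync() {
    # -n (--dry-run) lists what would be transferred without copying anything.
    rsync -e 'ssh -i ~/.ssh/id_rsa' -rulvhtpz -n "$1" "$2"
}
```

For example, preview_sync /Users/user/file_to_sync 'user@host.com:~/' prints the file list that the real command would send.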

Making rsync convenient:

rsync is really nice when it comes to automation, and adding it to a crontab entry comes in really handy. There are all kinds of options for cron; to view them, check out my knowledge base article on it.

If we want rsync to run automatically at 12pm and 4pm, this is what we would do:

Open up your terminal app and type the following:

crontab -e

Add the following lines to the file:

00 12 * * * rsync -e 'ssh -i ~/.ssh/id_rsa' -rulvhtpz /Users/user/file_to_sync user@host.com:~/
00 16 * * * rsync -e 'ssh -i ~/.ssh/id_rsa' -rulvhtpz /Users/user/file_to_sync user@host.com:~/
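
If you would rather script this than edit the file by hand, something along these lines might do. It is only a sketch: add_cron_line is a hypothetical helper name, and it assumes your crontab command accepts "-" to read the new table from standard input, as the common implementations do. It also uses 12,16 in the hour field, which covers both times with a single entry:

```shell
# add_cron_line is a hypothetical helper: it appends one line to the current
# user's crontab unless an identical line is already there.
add_cron_line() {
    line="$1"
    # An empty or missing crontab simply yields an empty string here.
    current=$(crontab -l 2>/dev/null)
    # -F fixed string, -x whole line: skip if the entry already exists.
    if printf '%s\n' "$current" | grep -Fxq -- "$line"; then
        return 0
    fi
    # Feed the old entries plus the new one back to crontab via stdin.
    {
        [ -n "$current" ] && printf '%s\n' "$current"
        printf '%s\n' "$line"
    } | crontab -
}

# 12,16 in the hour field runs the job at 12:00 and at 16:00:
# add_cron_line "0 12,16 * * * rsync -e 'ssh -i ~/.ssh/id_rsa' -rulvhtpz /Users/user/file_to_sync user@host.com:~/"
```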

If you’re really clever

If you are a programmer and want your code to automatically synchronize to a remote server, add a macro to your IDE that binds the rsync command to a button. For instance, if you attach the rsync command to the save action, saving and synchronizing happen in one step and you kill two birds with one stone.

For more information

Go into your terminal and type man rsync or rsync -h

SSH Tunneling; encrypted surfing with Virtual Hosts

Whether you’re the Secretary of Defense, or just an average Joe trying to survive with some peace of mind and security, encryption is a good thing. Do you have many virtual hosts on your unencrypted Apache server, but want encryption for whichever virtual host you specify? Here is the solution! Note, this is written for Linux clients, not Windows. You can adapt the instructions for Windows by using PuTTY and creating the tunnel that way.

First of all, here is the command to tunnel for Linux:

ssh -f -L 10000:your_virtual_host.com:80 user@myserver.com -N

Explanation of the above command:

  • ssh starts the ssh client
  • -f forks the ssh client into the background
  • -L forwards connections made to a local port through the ssh server to a destination, written as local_port:destination_host:destination_port
  • -N tells ssh not to execute a command on the remote server once you are logged in to it

After you have started the tunnel using the command above, it keeps running in the background until the ssh process is killed or the connection drops.
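
One way to make the tunnel easy to shut down again is to give it a control socket. The sketch below uses the OpenSSH options -M, -S and -O exit for that. The helper names and the socket path are my own assumptions, not part of the original command:

```shell
# Hypothetical helpers; the socket path is an assumption, any writable path works.
TUNNEL_SOCKET="${TUNNEL_SOCKET:-$HOME/.ssh/vhost-tunnel.sock}"

# Usage: open_tunnel <local_port> <virtual_host> <user@server>
open_tunnel() {
    # -M -S: run as a control master listening on the socket file
    # -f -N -L: same meaning as in the command above
    ssh -f -N -M -S "$TUNNEL_SOCKET" -L "$1:$2:80" "$3"
}

# Usage: close_tunnel <user@server>
close_tunnel() {
    # -O exit asks the background master process to shut down
    ssh -S "$TUNNEL_SOCKET" -O exit "$1"
}
```

With these, open_tunnel 10000 your_virtual_host.com user@myserver.com starts the tunnel and close_tunnel user@myserver.com stops it.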

Next, you would open your web browser and go to the following address with the address bar:

http://your_virtual_host.com:10000

By going to that address, it fails? What? Ohhhhh that’s right, you need to add the entry for that site to your hosts file:

Open the file:

vi /etc/hosts

(If you were doing this in Windows, the file is at C:\Windows\System32\drivers\etc\hosts)

Add this line to it:

127.0.0.1 your_virtual_host.com
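
If you set this up on more than one machine, a small helper that only adds the line when it is missing can save some typing. This is a sketch under my own assumptions: add_hosts_entry is a hypothetical name, and the optional second argument is there so you can try it on a copy of the file first:

```shell
# add_hosts_entry is a hypothetical helper. The second argument exists so the
# function can be tried on a copy of the file; it defaults to /etc/hosts,
# which needs root to modify.
add_hosts_entry() {
    host="$1"
    file="${2:-/etc/hosts}"
    # Succeeds if any 127.0.0.1 line already lists the host as a whole field.
    if awk -v h="$host" '
        $1 == "127.0.0.1" { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
        END { exit !found }' "$file"; then
        return 0
    fi
    printf '127.0.0.1 %s\n' "$host" >> "$file"
}
```

Running add_hosts_entry your_virtual_host.com as root appends the line once, and running it again changes nothing.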

Now go to the address again and it should work:

http://your_virtual_host.com:10000

If you were to run a packet sniffing program such as Wireshark, you could monitor your network adapter (wlan0 or eth0, whichever one you are using) and see that everything going to myserver.com (which is where you are tunneled into via ssh, using it to access your_virtual_host.com) is encrypted! Whoo hoo, now you can log into your unencrypted website without worrying that people can see your plain text password going over the network.

If you were to monitor your loopback interface (lo) then you would see all the clear text data – except it never leaves your computer unencrypted.

NOTE: Once you have added your_virtual_host.com to your /etc/hosts file, your computer will always look for that domain on the local machine, which means you need to open the tunnel to access it. As a result, if you try to ssh into your_virtual_host.com you will see that the connection is refused. The way around this is to ssh into a different domain on the same server (notice that I ssh’d into myserver.com instead of your_virtual_host.com).

Every host on your server that you add to the 127.0.0.1 line in the /etc/hosts file will work on the same port (10000 or whatever port you specify; you can use any port you want that isn’t taken by another program). So, if you have my_other_domain.com on the server and, in your web browser, you go to http://my_other_domain.com:10000, it will work with the existing tunnel as well.

View the next article for setting your computer to automatically start the tunnel when you log in.


Bootable USB installation disk (instead of CD) from CLI

There’s no reason to use a GUI program to create a bootable USB disk. Use the CLI.

When people download a Linux distribution and wish to install it from a flash disk instead of a CD, some use Unetbootin or similar programs (great programs by the way), but there’s a way to do it via Linux command line that’s just as easy.

Follow the steps below:

  • Download your disk image (linuxDistribution.iso)
  • Delete all data from your USB disk
  • Use the “blkid” command to figure out where your flash drive is mapped; /dev/sdb1, etc
  • If blkid doesn’t work, then use this command instead: sudo ls -l /dev/disk/by-id/*usb*
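  • Run the command below, replacing /dev/sdb with whatever yours is mapped to. Write to the whole disk (/dev/sdb), not to a partition on it (/dev/sdb1), otherwise the drive will usually not boot:
sudo dd if="$HOME/Downloads/linuxDistribution.iso" of=/dev/sdb bs=4M; sync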
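
Since dd overwrites whatever you point it at, a small guard around it can catch a mistyped path before any damage is done. This is only a sketch. write_iso is a hypothetical wrapper name, and it does not check that the device is the right one, only that it is a block device at all:

```shell
# write_iso is a hypothetical wrapper around the dd command above.
# Usage: write_iso <image.iso> <device>
write_iso() {
    iso="$1"
    dev="$2"
    if [ ! -f "$iso" ]; then
        echo "image not found: $iso" >&2
        return 1
    fi
    # -b is true only for block devices, so a mistyped path is rejected
    # before dd can create or overwrite a regular file.
    if [ ! -b "$dev" ]; then
        echo "not a block device: $dev" >&2
        return 1
    fi
    sudo dd if="$iso" of="$dev" bs=4M && sync
}
```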

Manually clearing the memory cache in Linux

If your Linux based computer is using a lot of memory and you wish to clear the memory caches, rebooting always helps. However, if you don’t want to reboot, luckily there is a command that will do it. Typically there is nothing to worry about, since Linux is very good at handling memory.

First off, Linux aggressively uses memory for caching so that programs run faster. If another program needs to start and the memory is in use, Linux removes some of the cache to make room. To see what I am talking about, run the following command in your terminal or console program of choice:

free -m

Notice that your cache may take more than half of your memory. In my case, I have 8 GB of RAM and there are times where I only have 50 MB of RAM available. After I ran the command to clear the cache, here was my result (At the time I made this article, I had a VMware virtual machine running Windows 7; it was using 1.5 GB of RAM which explains the heavy amount of used RAM):

me@ubuntu:~$ free -m

             total       used       free     shared    buffers     cached
Mem:          7936       3046       4890          0         14        424
-/+ buffers/cache:       2606       5329
Swap:         1906          0       1906

To make this happen, open a terminal and run su - (if you are on Ubuntu, you first have to run sudo passwd and create a root password).

Enter your root password and execute the following command:

sync; echo 3 > /proc/sys/vm/drop_caches

Now if you run free -m again, you will see that your memory is once again free.
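
To put a number on the difference, you could read the page cache size before and after. The sketch below takes it from the "Cached:" line of /proc/meminfo, which the kernel reports in kB. cached_mb is a hypothetical helper name, and the optional file argument is only there so it can be tried on a saved copy:

```shell
# cached_mb is a hypothetical helper. It reads the "Cached:" line (in kB)
# from /proc/meminfo, or from a file given as the first argument, and
# prints the value in whole megabytes.
cached_mb() {
    awk '$1 == "Cached:" { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Run as root around the drop_caches command to compare:
#   before=$(cached_mb)
#   sync; echo 3 > /proc/sys/vm/drop_caches
#   echo "page cache: ${before} MB -> $(cached_mb) MB"
```

On systems that use sudo instead of a root login, echo 3 | sudo tee /proc/sys/vm/drop_caches writes the same value.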

How to delete files over x days old in *nix

If you have a server with an automated backup system, and you want backups over x days old to be deleted, this is the way you would do it. I use the command in a shell script that is executed every night at midnight – after the backup has been made.

rm {} \; executes the removal of the files. There is a space in between the braces and the backslash.

Here is the command (My shell script runs as root when this is done):

find /foo/bar/* -mtime +15 -exec rm {} \;
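
Because the command deletes things for good, it is worth listing the matches first. Here is a sketch of a wrapper that only prints by default and removes files when asked to. prune_old and its --delete switch are hypothetical names of my own, and limiting the match to regular files with -type f is my addition:

```shell
# prune_old is a hypothetical helper.
# Usage: prune_old <directory> <days> [--delete]
prune_old() {
    dir="$1"
    days="$2"
    if [ "${3:-}" = "--delete" ]; then
        # -type f limits the match to regular files; "+" passes many
        # file names to a single rm call.
        find "$dir" -type f -mtime +"$days" -exec rm -- {} +
    else
        # Default: only print what would be removed.
        find "$dir" -type f -mtime +"$days" -print
    fi
}
```

So prune_old /foo/bar 15 shows the candidates, and prune_old /foo/bar 15 --delete removes them.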

The /etc/skel directory

On *nix servers, you will find a directory at /etc/skel.

Anything that is in this directory will automatically be copied into the home directory of each new user that is added, and will be owned by that user. This is very useful if you are constantly adding new users to a server who are in need of the same directory structure.
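
As a rough illustration, this is how the skeleton could be filled in. The directory and file names are made up for the example, and seed_skel is a hypothetical helper. It takes the target directory as an argument so you can try it somewhere harmless before running it as root against /etc/skel:

```shell
# seed_skel is a hypothetical helper; the directory and file names below are
# illustrative only. The first argument defaults to /etc/skel (root required).
seed_skel() {
    skel="${1:-/etc/skel}"
    # Directories every new account should start with.
    mkdir -p "$skel/public_html" "$skel/logs"
    # A starter file copied into each new home directory.
    printf 'Welcome to the server.\n' > "$skel/README"
}
```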

On another note, if you need to add new users, don’t use the useradd command – use the adduser command – it does everything much easier (depending on the *nix distribution).

Add Users in Unix/Linux/OSX

You can use the useradd command like this:

useradd kris

BUT, it makes more work. Try the adduser command like this:

adduser kris

The adduser command has a wizard that guides you through the process of adding users. Make sure you set up the /etc/skel directory first so that the new users will have any files and directories that you wish for them to have.
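
If you need to add users from a script, where a wizard gets in the way, useradd can still do the job with a couple of extra options. A minimal sketch, assuming /bin/bash is the shell you want. create_user is a hypothetical helper name:

```shell
# create_user is a hypothetical helper for scripted, non-interactive use.
# Must run as root. Usage: create_user <name>
create_user() {
    # -m creates the home directory and fills it from /etc/skel
    # -s sets the login shell
    useradd -m -s /bin/bash "$1"
}
```

After create_user kris you would still run passwd kris to give the account a password.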