Sometimes you want statistics about your files. I needed them to figure out why some file synchronization services didn't work well. Typical questions are:
1. How many files are there in a folder?
2. How many of those are regular files?
3. How many of those are sub-folders?
4. What's the maximum number of nested folders?
By default, Windows and Mac ship with built-in file managers that answer some of these questions. File Explorer on Windows can answer #1, #2 and #3 through Properties. Finder on Mac can answer #1 through Get Info, but it has no way to answer #2 or #3. Neither of the two default file managers can answer #4.
find can. It is a UNIX command line tool that searches for files in a directory hierarchy.
find is native to Linux, and Mac and Windows each have their own version of it. However, there are discrepancies among the versions that can trip users up.
Demo folder for illustration
    # Make a demo folder
    $ mkdir demo-find
    $ cd demo-find/

    # Make 2 main folders, one with nested sub-folders inside
    $ mkdir -p d1/d11/d111 d1/d12 d2

    # Create files
    $ touch f1.txt f2.txt d1/f3.txt d1/d11/f4.txt d1/d11/d111/f5.txt d1/d11/d111/f6.txt d1/d12/f7.txt d2/f8.txt

    # We'll then have this
    demo-find/
    ├── d1
    │   ├── d11
    │   │   ├── d111
    │   │   │   ├── f5.txt
    │   │   │   └── f6.txt
    │   │   └── f4.txt
    │   ├── d12
    │   │   └── f7.txt
    │   └── f3.txt
    ├── d2
    │   └── f8.txt
    ├── f1.txt
    └── f2.txt

    5 directories, 8 files
find takes the form of
find [PATH] [OPTION]
Find all files
    # This will list everything (files and folders) under the given folder, one per line
    # This is not yet what we want
    # The trailing slash / only matters when the path is a symlink to a folder
    $ find demo-find/
    demo-find/
    demo-find/d1
    demo-find/d1/d11
    demo-find/d1/d11/d111
    demo-find/d1/d11/d111/f5.txt
    demo-find/d1/d11/d111/f6.txt
    demo-find/d1/d11/f4.txt
    demo-find/d1/d12
    demo-find/d1/d12/f7.txt
    demo-find/d1/f3.txt
    demo-find/d2
    demo-find/d2/f8.txt
    demo-find/f1.txt
    demo-find/f2.txt
Count the number of all files: pipe the output to wc -l
    # This will also count the root 'demo-find' folder as 1
    $ find demo-find/ | wc -l
    14
Count the number of regular files
    $ find demo-find/ -type f | wc -l
    8
Count the number of directories
    # This will also count the root 'demo-find' folder as 1
    $ find demo-find/ -type d | wc -l
    6
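Both counts above include the starting folder itself. If that is not wanted, one option is `-mindepth 1`, which both GNU and BSD find accept and which skips the starting point (depth 0). The following is a minimal sketch, assuming a throwaway copy of the demo tree built in a temporary directory; the variable names are only for illustration.

```shell
# Build a throwaway copy of the demo tree in a temp directory
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p demo-find/d1/d11/d111 demo-find/d1/d12 demo-find/d2
touch demo-find/f1.txt demo-find/f2.txt demo-find/d1/f3.txt \
      demo-find/d1/d11/f4.txt demo-find/d1/d11/d111/f5.txt \
      demo-find/d1/d11/d111/f6.txt demo-find/d1/d12/f7.txt demo-find/d2/f8.txt

# -mindepth 1 leaves out the starting point (depth 0), so 'demo-find' is not counted.
# tr strips the padding that some wc implementations add around the number.
all_count=$(find demo-find/ -mindepth 1 | wc -l | tr -d ' ')
dir_count=$(find demo-find/ -mindepth 1 -type d | wc -l | tr -d ' ')
file_count=$(find demo-find/ -type f | wc -l | tr -d ' ')

echo "entries=$all_count dirs=$dir_count files=$file_count"
```

On the demo tree this should report 13 entries, 5 directories and 8 files, which matches the summary of the tree listing earlier.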
FIND THE MAXIMUM NUMBER OF NESTED FOLDERS
According to find's manpage, we can use the option -printf with the format %d to print how deeply a given filepath is nested.
    -printf format
        %d  File's depth in the directory tree; 0 means the file is a starting-point.
        %p  File's name.
Let’s try it on our Mac:
    $ find demo-find/ -type d -printf '%d:%p\n'
    find: -printf: unknown primary or operator
Great, we've got an error: -printf: unknown primary or operator. How come?
It turns out that macOS is based on BSD UNIX, and the BSD implementation of find doesn't have the -printf option. GNU find, which is what most Linux systems ship nowadays, has many added features such as -printf. It's a similar story on Windows, where FIND is a separate Windows command (it searches for text inside files) that doesn't support -printf either.
Depending on your OS, there are different ways to get GNU find on your system. It's part of the findutils package, so we need to install findutils to get find. On Mac, use Homebrew or MacPorts. On Windows, try Cygwin.
brew on Mac:
    # Install 'findutils' to get 'find'
    $ brew install findutils

    # Edit PATH to make sure our terminal uses GNU's find instead of Apple's find
    # Edit .bashrc or .bash_profile
    # (/usr/local is Homebrew's prefix on Intel Macs; on Apple Silicon it is /opt/homebrew)
    export PATH="/usr/local/opt/findutils/libexec/gnubin:$PATH"
    export MANPATH="/usr/local/opt/findutils/libexec/gnuman:$MANPATH"

    # Restart the terminal or source the bash file
    $ source ~/.bashrc

    # Make sure 'find' is now the GNU version
    # Make sure the path is NOT /usr/bin/find which is the default Apple BSD version
    $ which find
    /usr/local/opt/findutils/libexec/gnubin/find
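Besides checking the path, it can help to ask find itself which flavor it is. The sketch below relies on GNU find answering `--version` with a banner that mentions GNU, while BSD find rejects that option. The helper name `find_flavor` is made up for this example and is not part of findutils.

```shell
# Hypothetical helper (not part of findutils): report which find is first on PATH
find_flavor() {
  # GNU find prints a version banner containing "GNU";
  # BSD find rejects --version, so grep finds nothing
  if find --version 2>/dev/null | grep -q 'GNU'; then
    echo "gnu"
  else
    echo "other"
  fi
}

flavor=$(find_flavor)
echo "find on PATH: $(command -v find) ($flavor)"
```

If this reports "other" after the PATH change, the shell is most likely still picking up /usr/bin/find, and -printf will keep failing.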
Count max folder’s depth
-printf should now work:
    $ find demo-find/ -type d -printf '%d:%p\n'
    0:demo-find/
    1:demo-find/d1
    2:demo-find/d1/d11
    3:demo-find/d1/d11/d111
    2:demo-find/d1/d12
    1:demo-find/d2
We can see that 3 is the maximum folder depth.
Count max depth: pipe it through sort and tail
    $ find demo-find/ -type d -printf '%d:%p\n' | sort -n | tail -1
    3:demo-find/d1/d11/d111
And there we have it, 3 is the max depth. Sometimes the filepath may be very long, so we can print just the folder's depth by dropping the :%p part of the format:
    $ find demo-find/ -type d -printf '%d\n' | sort -n | tail -1
    3
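If installing GNU find is not an option, a rough alternative is to derive the depth from the number of path separators, which works with any find. This is only a sketch under some assumptions: the starting path is given without a trailing slash and contains no slash itself, and no folder name contains a newline. The demo tree is rebuilt in a temporary directory so the example is self-contained.

```shell
# Build a throwaway copy of the demo tree in a temp directory
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p demo-find/d1/d11/d111 demo-find/d1/d12 demo-find/d2
touch demo-find/f1.txt demo-find/f2.txt demo-find/d1/f3.txt \
      demo-find/d1/d11/f4.txt demo-find/d1/d11/d111/f5.txt \
      demo-find/d1/d11/d111/f6.txt demo-find/d1/d12/f7.txt demo-find/d2/f8.txt

# Split each path on '/': 'demo-find' has 1 field (depth 0),
# 'demo-find/d1/d11/d111' has 4 fields (depth 3).
# awk keeps the largest depth seen and prints it at the end.
max_depth=$(find demo-find -type d |
  awk -F/ '{ depth = NF - 1; if (depth > max) max = depth } END { print max + 0 }')

echo "$max_depth"
```

On the demo tree this should print 3, the same answer as the -printf pipeline, without depending on GNU-only options.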