Your mission: save some jpeg metadata mislaid by Windows XP.

Postby RedWillow » Tue Sep 28, 2010 1:07 pm

What I know about shell scripting can be written on the back of a ha'penny stamp, so please be gentle. This is a problem I got involved with on a Mac forum, but the script I've come up with works equally well on a Mac or in Linux so long as you have the command line utility exiftool installed: exiftool in MacOS, the libimage-exiftool-perl package in Ubuntu, exiftool in openSUSE or perl-Image-ExifTool in Fedora. I've also tried it with exif in Ubuntu and openSUSE (with 'exif' instead of 'exiftool' in the script) and it works, but with an oddity in Ubuntu - see below. The scripts posted here are the exiftool versions.

This all started with someone converting from Windows XP to MacOS and migrating all their photos to their Mac. He'd added personal metadata in XP but couldn't read it in MacOS. In an XP explorer window you right-click on the jpeg, then Properties > Summary where you can add comments for 5 fields. Trouble is, these appear to be non-standard EXIF data and all the Mac GUI utilities I tried couldn't read the 'XP' metadata - only the standard subset. The terminal utilities exiftool and exif could - the XP fields are called XP* - so this was my first working version of a script:

Code: Select all
#!/bin/sh
# Simple script to search home folder recursively,
# print filename of all files with extension jpg or JPG
# together with XP EXIF data if present

echo "" > $HOME/Desktop/picture_list.txt
find $HOME -type f -iname "*jpg" |
while read file ; do
  echo "${file}" >> $HOME/Desktop/picture_list.txt
  exiftool "${file}" | grep XP >> $HOME/Desktop/picture_list.txt
done


Which works well enough, but as you can see it only searches for .jpg and .JPG. It needs to search for .jpe and .jpeg as well. I came up with this:

Code: Select all
#!/bin/sh
# Simple script to search home folder recursively,
# print filename of all files with extensions jpg, JPG,
# jpeg, JPEG, jpe or JPE,
# together with XP EXIF data if present

ext="*jpg *jpeg *jpe"
echo "" > $HOME/Desktop/picture_list.txt
for i in $ext ; do
  find $HOME -type f -iname $i |
    while read file ; do
      echo "${file}" >> $HOME/Desktop/picture_list.txt
      exiftool "${file}" | grep XP >> $HOME/Desktop/picture_list.txt
    done
done


But that's inelegant. Using a for i in $ext loop it searches the home folder for jpg/JPG, then jpeg/JPEG and lastly jpe/JPE. Three passes through $HOME takes a long time and the output is mucky. What I would like to do is to have an OR operator in the find line. I found something about using "[ blah ] || [blah ] || [blah]" but I couldn't work out how to use it in the find line. Any suggestions, please?
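One way to get an OR in a single pass is find's own -o operator, with the alternatives grouped in escaped parentheses. This is a minimal sketch, assuming the same three extensions; the temporary folder and file names are made up so that it can be run safely anywhere:

```shell
#!/bin/sh
# Demo tree with hypothetical file names, so the sketch is self-contained.
demo=$(mktemp -d)
touch "$demo/a.jpg" "$demo/b.JPEG" "$demo/c.jpe" "$demo/notes.txt"

# \( ... \) groups the tests, -o means OR, -iname ignores case.
# The patterns are quoted so the shell hands them to find unexpanded.
matches=$(find "$demo" -type f \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.jpe' \) | sort)
echo "$matches"
```

In the real script the find line would start from "$HOME" instead of the demo folder, and the rest of the while loop could stay as it is.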

The oddity in Ubuntu (Maverick) affects both exif and exiftool. If you run the longer script from the Desktop it only scans the desktop. You have to run it from home for it to work as described on the tin. But the shorter script works wherever you run it from. Weird. :? I'll try Lucid later.* It may be a Maverick bug.

I've probably managed to frighten the OP of the Mac thread away with mention of terminals and scripts, but I'd like to tidy this up and post it somewhere in a howto. It could be useful for both Linux and Mac users migrating from XP.

* Edit: no, it's the same in Lucid. The longer script run from the desktop only scans the desktop. The shorter script works OK. Doubly weird, since it works fine in Fedora and Suse.
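A possible explanation for the oddity, offered as an assumption rather than a diagnosis: in the longer script $ext and $i are unquoted, so the shell may expand *jpg against the current folder before find ever sees it. Run from a folder that contains a matching file, find is then handed that file's name as the pattern instead of the wildcard. The sketch below reproduces the effect with made-up file names:

```shell
#!/bin/sh
# Demo tree with hypothetical names: one JPEG at the top, one in a subfolder.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/top.jpg" "$demo/sub/deep.jpg"
cd "$demo" || exit 1

pattern="*jpg"
# Unquoted: the shell expands *jpg to top.jpg first, so find looks for that name only.
unquoted=$(find . -type f -iname $pattern | sort)
# Quoted: find receives the wildcard itself and matches in every subfolder.
quoted=$(find . -type f -iname "$pattern" | sort)

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

If that is the cause, it would also explain why the shorter script is unaffected: there the pattern is written in quotes directly on the find line.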
RedWillow
LXF regular
 
Posts: 719
Joined: Thu May 29, 2008 1:05 pm

Postby nelz » Tue Sep 28, 2010 1:39 pm

You could use the file command to find all JPEG files, irrespective of the filename.

Code: Select all
find -type f | file -f - | awk -F : '/JPEG\ image\ data/ {print $1}'
"Insanity: doing the same thing over and over again and expecting different results." (Albert Einstein)
nelz
Site admin
 
Posts: 8495
Joined: Mon Apr 04, 2005 11:52 am
Location: Warrington, UK

Postby RedWillow » Tue Sep 28, 2010 4:20 pm

nelz wrote:
Code: Select all
find -type f | file -f - | awk -F : '/JPEG\ image\ data/ {print $1}'


That works beautifully - thanks, nelz. You are a scholar and a gentleman, sir.

Here's my amended script:

Code: Select all
#!/bin/sh
# Simple script to search home folder recursively,
# print filename of all JPEG files
# together with XP EXIF data if present

echo "" > $HOME/Desktop/picture_list.txt
find $HOME -type f | file -f - | awk -F : '/JPEG\ image\ data/ {print $1}' |
while read file ; do
  echo "${file}" >> $HOME/Desktop/picture_list.txt
  exiftool "${file}" | grep XP >> $HOME/Desktop/picture_list.txt
done


One benefit is that it neatly sidesteps the odd issue with using the script with the for-done loop on the Ubuntu desktop. Only one slight problem: it lists all the JPEGs in the firefox cache, in ~/.mozilla/firefox/randomstring.default/Cache/. That doesn't embarrass me but it might embarrass some. :)

I have no idea how that line works - at the moment the back of my ha'penny stamp is not large enough to encompass it. But, fear not, I shall do my homework and work it out.

Onward and upward, and thanks once again.

Postby nelz » Tue Sep 28, 2010 4:30 pm

find lists all the files and pipes the list to file, which prints the path and type of each one. That output is piped to awk, which picks out the lines containing "JPEG image data" and prints the first field on each line, the path. It uses : as the field separator, so it works with file names that contain spaces too.
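To see the awk stage on its own, here is a small sketch that feeds it two lines shaped like the output of file. The paths and descriptions are invented for illustration, not real output:

```shell
#!/bin/sh
# Two lines in the "path: description" shape that file prints (hypothetical paths).
sample='/home/me/My Pictures/flower.jpg: JPEG image data, JFIF standard 1.01
/home/me/notes.txt: ASCII text'

# -F : splits each line at colons; the /.../ pattern keeps only the JPEG lines;
# $1 is everything before the first colon, i.e. the path, spaces included.
paths=$(printf '%s\n' "$sample" | awk -F : '/JPEG image data/ {print $1}')
echo "$paths"
```

One consequence of splitting at colons is that a path which itself contains a colon would be cut short at that point.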

find has options to exclude path or file patterns, so you could skip potentially embarrassing files, or use privacy mode when visiting *those* sites :)
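As one illustration of excluding a path with find, the -prune action can stop it descending into folders of a given name. A minimal sketch, assuming the folder to skip is called Cache; the demo tree and the "profile" folder name are made up:

```shell
#!/bin/sh
# Demo tree: one picture to keep and one inside a browser-style Cache folder.
demo=$(mktemp -d)
mkdir -p "$demo/Pictures" "$demo/.mozilla/firefox/profile/Cache"
touch "$demo/Pictures/keep.jpg" "$demo/.mozilla/firefox/profile/Cache/skip.jpg"

# -prune stops find descending into any folder named Cache;
# everything else falls through to the -type f -print branch.
found=$(find "$demo" -type d -name Cache -prune -o -type f -print)
echo "$found"
```

The same find expression could replace the plain "find $HOME -type f" at the start of the pipeline.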

Postby RedWillow » Tue Sep 28, 2010 5:06 pm

Oh dear. Something odd is happening in MacOS. When I run that latest script, I get this:

Code: Select all
./testscript: line 9:  : command not found
./testscript: line 10:  : command not found


... repeated many times in the terminal after a short delay. Then a longer delay. Then the above repeated a few times and then the script terminates. The resulting picture_list.txt file is empty apart from one newline. It worked fine in Ubuntu. :(

Your line gives the expected output and lines 9 and 10 were working just fine in my earlier scripts. I'll investigate more and post back - probably not before tomorrow.

In the meantime...

nelz wrote:find has options to exclude path or file patterns, so you could skip potentially embarrassing files, or use privacy mode when visiting *those* sites :)


Since I was originally doing this for a Mac user, I don't think *those* jpegs will be much of a problem in the Mac world. According to NewsBiscuit. :P

Postby nelz » Tue Sep 28, 2010 6:52 pm

Try running it with "sh -x ./script" to see exactly what is breaking.

Postby RedWillow » Tue Sep 28, 2010 7:51 pm

Success!

I had to change $HOME to $HOME/Desktop in the find line and have just one jpeg on the desktop not to be overwhelmed with data, after which your "sh -x ./script" gave me:

Code: Select all
Macintosh:Desktop RedWillow$ sh -x ./testscript
+ echo ''
+ find /Users/RedWillow/Desktop -type f
+ file -f -
+ awk -F : '/JPEG\ image\ data/ {print $1}'
+ read file
+   echo /Users/RedWillow/Desktop/YellowFlower.jpg
./testscript: line 9:  : command not found
+   exiftool /Users/RedWillow/Desktop/YellowFlower.jpg
+ grep XP
./testscript: line 10:  : command not found
+ read file
Macintosh:Desktop RedWillow$ sh -x ./testscript


Which didn't tell me much more at first, but then I noticed the indented echo and exiftool lines and had a (rare) brainwave. I edited the script so that it looked like:

Code: Select all
#!/bin/sh
# Simple script to search home folder recursively,
# print filename of all JPEG files
# together with XP EXIF data if present

echo "" > $HOME/Desktop/picture_list.txt
find $HOME -type f | file -f - | awk -F : '/JPEG\ image\ data/ {print $1}' |
while read file ; do
echo "${file}" >> $HOME/Desktop/picture_list.txt
exiftool "${file}" | grep XP >> $HOME/Desktop/picture_list.txt
done


And then it worked just fine. It looks as though the indenting spaces were upsetting the bash shell in MacOS, which is strange because when I was taught elementary Pascal programming (giving my age away there :() I was told to indent do-while loops and whatever to be able to see what was going on. Clearly not in the Mac world. Any idea why MacOS should object to good scripting practice?
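Ordinary spaces or tabs at the start of a line should not trouble sh or bash on any platform. One possible cause, which is only an assumption since nothing in the thread confirms it, is that the indentation contained non-breaking spaces (UTF-8 bytes 0xC2 0xA0). These can creep in when code is copied from a web page, or when Option+Space is typed on a Mac. The shell treats such a character as a command name rather than as white space, which would fit the " : command not found" messages. A sketch of how one might check a script for them and write a cleaned copy; the file name is invented:

```shell
#!/bin/sh
demo=$(mktemp -d)
script="$demo/testscript"
nbsp=$(printf '\302\240')   # UTF-8 encoding of U+00A0, the non-breaking space

# Build a two-line script whose second line starts with a non-breaking space.
printf '#!/bin/sh\n%s echo hello\n' "$nbsp" > "$script"

# Report the line numbers containing the character (byte-wise match in the C locale).
bad_lines=$(LC_ALL=C grep -an "$nbsp" "$script" | cut -d: -f1)
echo "lines with non-breaking spaces: $bad_lines"

# Write a cleaned copy with each one replaced by an ordinary space, then run it.
LC_ALL=C sed "s/$nbsp/ /g" "$script" > "$script.clean"
output=$(sh "$script.clean")
echo "$output"
```

If the check reports nothing for a misbehaving script, the cause lies elsewhere and this guess can be discarded.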

Anyway - to end on a lighter note, and to prove that the Ubuntu GUI is superior to the MacOS GUI, the OP of that Mac forum thread really wanted a GUI tool to examine the metadata of each photo one-by-one. With 19000 jpegs (so he says) to sort through he clearly wants a project to occupy the long, dark, winter evenings. This is what I saw in Windows XP when I modified a jpeg to include the XP metadata:

Image

All I had to do in Ubuntu was to install the eog-plugins package, open YellowFlower.jpg with eye of gnome, et voilà:

Image

And I'm sure KDE could do that as well. MacOS though? No. I couldn't find a single GUI app that would display that data.

Thanks for all your help, nelz. Much appreciated! Any further comments will be gratefully received.