Inside the mind(s) of a mad man: python


Friday, March 1, 2013

Importing 16-bit TIFFs into NumPy

A colleague approached me today asking whether I had tried importing TIFFs into Python for image processing with SciPy. My image processing experience has mainly been within MATLAB, so we set about working through this together. Unfortunately, none of the images we tried to process with imread and imshow worked properly: each displayed as a plain white image. Inspecting the imread output showed that it had returned a matrix of 255s, even though the source TIFF contained a greyscale image of something more interesting. Further investigation was warranted.

I first looked at the properties of the greyscale TIFF and saw that it was saved as a 16-bit image. I initially overlooked this, thinking it was the default. To cut a long story short, I now realise this was the key fact: once the file was converted down to 8 bits, the TIFF imported without any problem. So we were faced with a choice. Do we: 1. convert an entire library of TIFF files from 16 bits to 8 bits, 2. revert to using MATLAB, or 3. try to get imread to open 16-bit TIFFs? Option 1 was immediately rejected as the library was enormous, leaving options 2 and 3. We decided to continue with Python and look for a solution.

A quick search on the Internet revealed that others had experienced this problem, but few offered workable solutions. One exception was another blogger, Philipp Klaus, who listed several methods for overcoming it. Installing some of the packages these required proved too challenging on the Mac with its limited Python library, so I stopped. Fortunately, my colleague picked up where I left off and found an interesting comment on this site, posted by Mathieu Leocmach. It included a snippet of code that my colleague tried on our data, but it did not work as expected. Here's the code in full:


import numpy as np
import Image

def readTIFF16(path):
    """Read 16bits TIFF"""
    im = Image.open(path)
    out = np.fromstring(
        im.tostring(), 
        np.uint8
        ).reshape(tuple(list(im.size)+[2]))
    return (np.array(out[:,:,0], np.uint16)<<8)+out[:,:,1]

Looking over the code, I noticed that the variable out has three dimensions. The third dimension selects which of the two bytes of each pixel to use. According to the code, the most significant byte is at index [:, :, 0], whereas the least significant is at [:, :, 1]. When we tried it out, the sum produced mostly noise, although certain elements appeared to have some order. Further analysis revealed that the more interesting aspects of the image were contained within [:, :, 0] instead, and the noise within the other. However, there was also some order in [:, :, 1], despite it being mostly noise, so we couldn't simply reject this byte. This pointed to the two bytes being combined in the wrong order, which may be the result of endianness. Changing the code to the following allowed us to see the image:


import numpy as np
import Image

def readTIFF16(path):
    """Read 16bits TIFF"""
    im = Image.open(path)
    out = np.fromstring(
        im.tostring(), 
        np.uint8
        ).reshape(tuple(list(im.size)+[2]))
    return (np.array(out[:,:,1], np.uint16)<<8)+out[:,:,0]
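
To see why swapping the two byte planes matters, here is a minimal sketch using made-up pixel values rather than a real TIFF. It assumes each 16-bit pixel is stored least significant byte first (little-endian), which is what the fix above suggests for our files. The names pixels, planes, wrong and right are purely illustrative.

```python
import numpy as np

# Two made-up 16-bit pixel values in a 1x2 "image" (illustrative only).
pixels = np.array([[1000, 40000]], dtype=np.uint16)

# Serialise as little-endian bytes (an assumption about the file layout),
# then view them as pairs of bytes, as the uint8 reshape does.
raw = pixels.astype("<u2").tobytes()
planes = np.frombuffer(raw, dtype=np.uint8).reshape(1, 2, 2)

# Treating byte 0 as the most significant byte scrambles the values...
wrong = (planes[:, :, 0].astype(np.uint16) << 8) + planes[:, :, 1]
# ...whereas treating byte 1 as the most significant byte recovers them.
right = (planes[:, :, 1].astype(np.uint16) << 8) + planes[:, :, 0]

print(wrong)  # [[59395 16540]]
print(right)  # [[ 1000 40000]]
```

With the bytes combined in the wrong order, small changes in brightness become large jumps in value, which is consistent with an image that looks mostly like noise.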

But if we're writing code like this, there's no reason to split the original image into two byte sequences at all. We could accomplish the same task by using np.uint16 directly instead of np.uint8, which removes the need for a third dimension in the output. I settled on the following bit of code, which worked well for us. Note that this code could, of course, be tidied up further, but I won't do that here.


import numpy as np
import Image

def readTIFF16(path):
    """Read 16bits TIFF"""
    im = Image.open(path)
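    # PIL reports size as (width, height), whereas NumPy expects
    # (rows, columns), so the shape is reversed for non-square images.
    out = np.fromstring(
        im.tostring(),
        np.uint16
        ).reshape(im.size[::-1])
    return out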

Now we're able to use 16-bit greyscale images within Python without having to convert entire image libraries or resort to other packages or software. It's possible that the original code will work for some people and that our version will look like white noise to them. I believe this is due to endian differences between the data and the computer. Hopefully future versions of Python will address this issue and make the necessary checks on TIFF source files so that we don't have to find workarounds. Not that we don't enjoy finding and creating these solutions!
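
One way to avoid depending on the machine's native byte order is to name the byte order explicitly in the NumPy dtype. The following is only a sketch under assumptions: decode_uint16 is a hypothetical helper, and the raw bytes are synthetic rather than read from a TIFF.

```python
import sys
import numpy as np

def decode_uint16(raw, width, height, byteorder="little"):
    """Interpret raw bytes as a (height, width) array of 16-bit pixels.

    Hypothetical helper: byteorder is "little" or "big" and must match
    the order in which the pixel data was stored.
    """
    dtype = np.dtype("<u2") if byteorder == "little" else np.dtype(">u2")
    return np.frombuffer(raw, dtype=dtype).reshape(height, width)

# Synthetic 2x3 image stored big-endian (most significant byte first).
original = np.arange(6, dtype=np.uint16).reshape(2, 3) * 1000
raw = original.astype(">u2").tobytes()

good = decode_uint16(raw, width=3, height=2, byteorder="big")
bad = decode_uint16(raw, width=3, height=2, byteorder="little")

print(sys.byteorder)                   # this machine's native byte order
print(np.array_equal(good, original))  # True
print(np.array_equal(bad, original))   # False
```

A bare np.uint16 uses the machine's native order, so the same code can decode one file correctly and turn another into noise. A TIFF file records its own byte order at the start of its header ("II" for little-endian, "MM" for big-endian), so a reader can in principle choose the right dtype per file.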

Wednesday, February 24, 2010

A Problem when Importing in Python

Last night, I had some problems importing a Cython-created object within Python. The silly thing is that I remembered encountering this problem before on a Linux-based setup and hadn't made a note of what I had done to overcome it. It didn't help that Python's error message claimed one of the functions in the generated C file was problematic, and what was worse, one computer worked fine whilst this one didn't, despite my following the same procedure.

It was time to delve into the potential causes: permissions, distribution incompatibility, some bug in the script, or something else. The first I had encountered on my Ubuntu machine, which was particularly unfriendly in that I had to ensure everyone had execute privileges before my library would run. I didn't have to sudo on my Mac, so privilege issues would have been surprising more than anything. The second was equally unlikely, as I was using the same "kitchen sink included" Python distribution (Enthought's EPD) as on my other Mac, where everything worked unhindered. That implicitly ruled out the third option too, which left only "something else", and that didn't really help.

After getting frustrated and replacing the compiler (Apple's Xcode tools), I was still in the same mess. I then attempted to copy a pre-built library from one Mac onto the other (they're virtually identical machines, so this shouldn't have been a problem), yet I still encountered the error message mentioning a problem with the C code, despite the library having been built successfully.

It was only after using IPython that I began to notice that the first time I imported my Cython object I would encounter the error, whereas the second time would always work unhindered. This suggested to me that there was perhaps a conflict between libraries somewhere. Sure enough, in the directory where I was executing the code there was a misplaced (and ancient) compiled .so library file, which should have been in the build directory instead. Upon deleting it from the base directory, importing the library worked first time.
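
A quick way to check for this kind of shadowing is to ask Python which file it would load for a module, and to list any compiled extension files in the working directory. This is a minimal sketch for a current Python 3, not what I ran at the time. The helpers locate_module and stray_extensions are hypothetical names, and json merely stands in for the module being debugged.

```python
import importlib.util
import os

def locate_module(name):
    """Return the path Python would load for `name`, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin

def stray_extensions(directory="."):
    """List compiled extension files (.so, .pyd) sitting in `directory`."""
    return sorted(
        f for f in os.listdir(directory)
        if f.endswith((".so", ".pyd"))
    )

# Replace "json" with the name of the module that fails to import.
print(locate_module("json"))
print(stray_extensions("."))
```

If the reported path points at the working directory rather than the build or install location, an old compiled file is probably being picked up first.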

I'm mainly writing this so that when I encounter this problem again (and I'm very likely to), I can hopefully remember that I at least wrote about it somewhere. It should have been one of the first things I looked for, but I failed to, as everything worked perfectly on one computer with the same suite of applications. Poor excuse, I know!

Saturday, June 6, 2009

Contents of Objects in Python

Lately, I've been programming a lot in Python, which has proven to be quite enlightening for me. My method of learning a new programming language has been to solve a problem through software and learn the syntax, functions and layout along the way. Normally, I'd look at pre-existing libraries or even source code in that language to get an idea of what is available. Python was no different, but to aid my learning, I chose to use an interactive interface to Python: IPython.

One thing always puzzled me: if I created an instance A of a class, say myObject (that is to say, A = myObject()), and then typed in A, how could I control the output so that it showed something more useful than <instance 'myObject' object at 0xlocation>?

Nowhere could I find it explicitly written, although now that I know what to look for, I see it is documented. When A is entered at the prompt in either IPython or Python, the interpreter calls A.__repr__() to obtain something useful to display. I took the opportunity to display a summary of the data contained in the object. This is done by defining __repr__() in the class and having it return a string (a str in Python). Likewise, if I were to write print A, Python would first look for A.__str__() and, if it is undefined, fall back to A.__repr__(). Thus, it is sometimes useful to display more information in __repr__() and keep __str__() for minimalist output (although this will depend on the application). For example:

class myObject:
    def __init__(self, contents):
        self.contents = contents

    def __repr__(self):
        return "Contents of container: " + str(self.contents)

    def __str__(self):
        return str(self.contents)
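
The fallback from __str__() to __repr__() can be checked with a short sketch. It is written for Python 3, where print is a function, and the class names WithBoth and ReprOnly are made up for illustration.

```python
class WithBoth:
    """Defines both __repr__ and __str__."""
    def __init__(self, contents):
        self.contents = contents

    def __repr__(self):
        return "Contents of container: " + str(self.contents)

    def __str__(self):
        return str(self.contents)


class ReprOnly:
    """Defines only __repr__, so str() falls back to it."""
    def __init__(self, contents):
        self.contents = contents

    def __repr__(self):
        return "Contents of container: " + str(self.contents)


a = WithBoth([1, 2, 3])
b = ReprOnly([1, 2, 3])

print(repr(a))  # Contents of container: [1, 2, 3]
print(str(a))   # [1, 2, 3]
print(str(b))   # Contents of container: [1, 2, 3]
```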

Indeed, one may notice the difference between the output of __repr__() for a NumPy array and that of __str__(): the former begins with the word "array" and wraps a listing of the contents in parentheses, whereas the latter prints only the bracketed contents.
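
For a small integer array, the difference looks like this (a minimal sketch; arr is just an example name):

```python
import numpy as np

arr = np.array([1, 2, 3])

print(repr(arr))  # array([1, 2, 3])
print(str(arr))   # [1 2 3]
```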
