To describe the data quality of timelines or images, we often use an array of integers in which each bit has a specific meaning, so that we can identify which issues affect each data point.

For example, suppose we have 10 data points and assign each of them an 8-bit integer for data quality. Generally 0 means a good data point, and any raised bit is a sign of some problem in the data. This is more compact than keeping a separate boolean array for each issue, and it allows batch operations with np.bitwise_and and np.bitwise_or.
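As a minimal sketch of the batch operations mentioned above, the snippet below merges two flag arrays with np.bitwise_or and then tests two bits in a single pass with np.bitwise_and. The bit assignments (BIT_SATURATED, BIT_GLITCH) and the array names are made up for illustration only.

```python
import numpy as np

# Hypothetical bit assignments, chosen only for this example
BIT_SATURATED = 0  # value 2**0 = 1
BIT_GLITCH = 3     # value 2**3 = 8

flags_a = np.array([0, 1, 0, 8, 0], dtype=np.uint8)
flags_b = np.array([0, 0, 8, 8, 1], dtype=np.uint8)

# Merge: a bit is raised in the result if it is raised in either input
combined = np.bitwise_or(flags_a, flags_b)

# Build one mask containing both bits and test them together
mask = np.uint8(2**BIT_SATURATED | 2**BIT_GLITCH)
is_bad = np.bitwise_and(combined, mask) > 0

print(combined)  # [0 1 8 8 1]
print(is_bad)    # [False  True  True  True  True]
```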

import numpy as np
flag = np.zeros(10, dtype=np.uint8)

The array uses just 8 bits (1 byte) per element:

%whos
Variable   Type       Data/Info
-------------------------------
flag       ndarray    10: 10 elems, type `uint8`, 10 bytes
np         module     <module 'numpy' from '/ho<...>kages/numpy/__init__.py'>
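To make the size comparison concrete, here is a small sketch that compares the single uint8 flag array with the alternative of one boolean array per issue. It assumes 8 distinct issues (one per bit); the names n_samples, n_flags and bool_arrays are introduced only for this example.

```python
import numpy as np

n_samples = 10
n_flags = 8  # assumed: one possible issue per bit of a uint8

# One packed array: 1 byte per sample holds all 8 flags
flag = np.zeros(n_samples, dtype=np.uint8)

# Alternative: one boolean array per issue, 1 byte per sample each
bool_arrays = [np.zeros(n_samples, dtype=bool) for _ in range(n_flags)]

packed_bytes = flag.nbytes
unpacked_bytes = sum(a.nbytes for a in bool_arrays)
print(packed_bytes, unpacked_bytes)  # 10 80
```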

Raising a bit seems as easy as adding 2**bit to the array. For example, bit 4 (counting from 0) has value 2**4 = 16, so:

flag[2:5] += 2**4
flag
array([ 0,  0, 16, 16, 16,  0,  0,  0,  0,  0], dtype=uint8)

The issue is that this only works if that bit was 0. If it was already raised, the addition carries over: it zeroes that bit and sets the next higher bit to 1 instead:

flag[2] += 2**4
flag
array([ 0,  0, 32, 16, 16,  0,  0,  0,  0,  0], dtype=uint8)
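Printing the values in binary makes the carry visible. This is only an illustrative sketch: it rebuilds a small flag array from scratch and uses np.binary_repr to show the bit pattern before and after the second addition.

```python
import numpy as np

flag = np.zeros(3, dtype=np.uint8)

flag[:] += 2**4  # first addition: bit 4 goes from 0 to 1
after_first = np.binary_repr(flag[0], width=8)

flag[0] += 2**4  # second addition: bit 4 is cleared, bit 5 is set
after_second = np.binary_repr(flag[0], width=8)

print(after_first)   # 00010000
print(after_second)  # 00100000
```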

Use bitwise operations

Fortunately, numpy supports bitwise operations that make this easier; see the 3 functions below:

def raise_bit_inplace(flag, bit=0):
    """Raise bit of the flag array in place
    
    This function modifies the input array,
    it also works on slices
    
    Parameters
    ----------
    flag : np.array
        flag bit-array, generally unsigned integer
    bit : int
        bit number to raise
    """
    flag[:] = np.bitwise_or(flag, 2**bit)


def raise_bit(flag, bit=0):
    """Raise bit of the flag array

    Parameters
    ----------
    flag : np.array
        flag bit-array, generally unsigned integer
    bit : int
        bit number to raise
        
    Returns
    -------
    output_flag : np.array
        input array with the requested bit raised
    """
    return np.bitwise_or(flag, 2**bit)


def check_bit(flag, bit=0):
    """Check if bit of the flag array is raised

    The output is a boolean array which could
    be used for slicing another array.
    
    Parameters
    ----------
    flag : np.array
        flag bit-array, generally unsigned integer
    bit : int
        bit number to check
        
    Returns
    -------
    is_raised : bool np.array
        True if the bit is raised, False otherwise    
    """
    return np.bitwise_and(flag, int(2**bit)) > 0

is_bit4_raised = check_bit(flag, bit=4)
is_bit4_raised
array([False, False, False,  True,  True, False, False, False, False,
       False])
assert np.all(is_bit4_raised[3:5])

They also work with slices of an array:

raise_bit_inplace(flag[6:], bit=1)
flag
array([ 0,  0, 32, 16, 16,  0,  2,  2,  2,  2], dtype=uint8)
# Running it twice doesn't change the value of the flag
raise_bit_inplace(flag[6:], bit=1)
flag
array([ 0,  0, 32, 16, 16,  0,  2,  2,  2,  2], dtype=uint8)
check_bit(flag, 1)
array([False, False, False, False, False, False,  True,  True,  True,
        True])
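Since check_bit returns a boolean array, it can be used directly to select or mask samples of another array. The sketch below assumes a hypothetical data array aligned with flag; check_bit is repeated so the snippet runs on its own.

```python
import numpy as np

def check_bit(flag, bit=0):
    """Same check as above, repeated so this snippet is self-contained"""
    return np.bitwise_and(flag, int(2**bit)) > 0

# Final state of the flag array from the example above
flag = np.array([0, 0, 32, 16, 16, 0, 2, 2, 2, 2], dtype=np.uint8)
data = np.arange(10, dtype=np.float64)  # hypothetical measurements

# Keep only samples with no bit raised
good_data = data[flag == 0]

# Select the samples affected by bit 1
bit1_data = data[check_bit(flag, 1)]

# Replace samples with bit 4 raised by NaN, keeping the array length
masked = np.where(check_bit(flag, 4), np.nan, data)

print(good_data)  # [0. 1. 5.]
print(bit1_data)  # [6. 7. 8. 9.]
```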

First blog post using a Jupyter Notebook with fastpages!!