To describe the data quality of timelines or images, we often use an array of integers in which each bit has a specific meaning, so that we can identify which issues affect each data point.

For example, suppose we have 10 data points and assign each one an 8-bit integer for data quality. Generally 0 means a good data point, and any raised bit signals some problem in the data. This is more compact than keeping a separate boolean array for each issue, and it allows batch np.bitwise_and and np.bitwise_or operations.

import numpy as np

flag = np.zeros(10, dtype=np.uint8)


The array uses just 8 bits (1 byte) per element:

%whos

Variable   Type       Data/Info
-------------------------------
flag       ndarray    10: 10 elems, type uint8, 10 bytes
np         module     <module 'numpy' from '/ho<...>kages/numpy/__init__.py'>
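To make the size comparison concrete, here is a minimal sketch that compares one uint8 flag array with a set of separate boolean arrays. The count of 8 criteria and the names `n_criteria` and `bool_flags` are assumptions for illustration only.

```python
import numpy as np

n_points = 10
n_criteria = 8  # assumed number of independent quality criteria

# One uint8 flag array: a single byte holds all 8 criteria for a data point
flag = np.zeros(n_points, dtype=np.uint8)

# Alternative: one boolean array per criterion (NumPy stores each bool in 1 byte)
bool_flags = [np.zeros(n_points, dtype=bool) for _ in range(n_criteria)]

flag_bytes = flag.nbytes
bool_bytes = sum(b.nbytes for b in bool_flags)
print(flag_bytes, bool_bytes)  # 10 80
```

With these assumptions the bit-flag array is 8 times smaller, because NumPy uses a full byte for every boolean element.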


Raising a bit seems as easy as adding 2**bit to the array. For example, bit 4 (counting from 0) has the value 2**4 = 16, so:

flag[2:5] += 2**4

flag

array([ 0,  0, 16, 16, 16,  0,  0,  0,  0,  0], dtype=uint8)

The issue is that this only works if the bit was 0. If it was already raised, the addition carries over: it zeroes that bit and raises the next higher bit instead:

flag[2] += 2**4

flag

array([ 0,  0, 32, 16, 16,  0,  0,  0,  0,  0], dtype=uint8)
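Looking at the binary representation makes the carry visible. The sketch below works on a single uint8 value rather than the array above and uses `np.binary_repr` to print the bits; the variable names are chosen for illustration.

```python
import numpy as np

value = np.uint8(16)  # bit 4 already raised

# Arithmetic addition: 16 + 16 = 32, the carry moves into bit 5
added = value + np.uint8(16)

# Bitwise OR: bit 4 simply stays raised
ored = np.bitwise_or(value, np.uint8(16))

print(np.binary_repr(int(value), width=8))  # 00010000
print(np.binary_repr(int(added), width=8))  # 00100000
print(np.binary_repr(int(ored), width=8))   # 00010000
```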

## Use bitwise operations

Fortunately, NumPy supports bitwise operations that make this easier; see the three functions below:

def raise_bit_inplace(flag, bit=0):
    """Raise bit of the flag array in place

    This function modifies the input array,
    it also works on slices

    Parameters
    ----------
    flag : np.array
        flag bit-array, generally unsigned integer
    bit : int
        bit number to raise
    """
    flag[:] = np.bitwise_or(flag, 2**bit)

def raise_bit(flag, bit=0):
    """Raise bit of the flag array

    Parameters
    ----------
    flag : np.array
        flag bit-array, generally unsigned integer
    bit : int
        bit number to raise

    Returns
    -------
    output_flag : np.array
        input array with the requested bit raised
    """
    return np.bitwise_or(flag, 2**bit)

def check_bit(flag, bit=0):
    """Check if bit of the flag array is raised

    The output is a boolean array which could
    be used for slicing another array.

    Parameters
    ----------
    flag : np.array
        flag bit-array, generally unsigned integer
    bit : int
        bit number to check

    Returns
    -------
    is_raised : bool np.array
        True if the bit is raised, False otherwise
    """
    return np.bitwise_and(flag, int(2**bit)) > 0

is_bit4_raised = check_bit(flag, bit=4)

is_bit4_raised

array([False, False, False,  True,  True, False, False, False, False,
       False])

assert np.all(is_bit4_raised[3:5])


They also work with slices of an array:

raise_bit_inplace(flag[6:], bit=1)

flag

array([ 0,  0, 32, 16, 16,  0,  2,  2,  2,  2], dtype=uint8)

Raising a bit that is already raised leaves the array unchanged:

raise_bit_inplace(flag[6:], bit=1)

flag

array([ 0,  0, 32, 16, 16,  0,  2,  2,  2,  2], dtype=uint8)

check_bit(flag, 1)

array([False, False, False, False, False, False,  True,  True,  True,
       True])
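As the check_bit docstring notes, the boolean output can be used to index another array. The sketch below is one way this could look: the `data` array is hypothetical, and check_bit is redefined in compact form so the example runs on its own.

```python
import numpy as np

def check_bit(flag, bit=0):
    """Compact copy of check_bit: True where the requested bit is raised."""
    return np.bitwise_and(flag, 2**bit) > 0

# Flag array in the final state shown above
flag = np.array([0, 0, 32, 16, 16, 0, 2, 2, 2, 2], dtype=np.uint8)

# Hypothetical data values, one per flag element
data = np.arange(10, dtype=float)

# Drop the points that have bit 1 raised
has_bit1 = check_bit(flag, bit=1)
data_without_bit1 = data[~has_bit1]

# Keep only the points with no bit raised at all
good_data = data[flag == 0]

print(data_without_bit1)  # [0. 1. 2. 3. 4. 5.]
print(good_data)          # [0. 1. 5.]
```

Comparing the flag with 0 selects fully clean points in a single step, while check_bit lets you exclude only the issues you care about.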

First blog post using a Jupyter Notebook with fastpages!!