import numpy as np
To describe the data quality of timelines or images, we often use an array of integers in which each bit has a specific meaning, so that we can identify which issues affect each data point.
For example, suppose we have 10 data points and assign each one an 8-bit data-quality flag. By convention 0 means a good data point, and any raised bit signals some problem in the data. This is more compact than keeping a separate boolean array for each issue, and it allows batch np.bitwise_and and np.bitwise_or operations.
flag = np.zeros(10, dtype=np.uint8)
The array uses just 8 bits per element
%whos
Variable Type Data/Info
-------------------------------
flag ndarray 10: 10 elems, type `uint8`, 10 bytes
np module <module 'numpy' from '/ho<...>kages/numpy/__init__.py'>
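As a rough sketch of the batch operations mentioned above, two flag arrays can be merged or compared element by element. The arrays `flags_a` and `flags_b` and their values are made up for this illustration; they stand for the output of two hypothetical quality checks on the same 5 samples.

```python
import numpy as np

# Hypothetical flags produced by two independent quality checks
flags_a = np.array([0, 1, 0, 4, 1], dtype=np.uint8)
flags_b = np.array([0, 2, 0, 4, 0], dtype=np.uint8)

# Bits raised by either check
combined = np.bitwise_or(flags_a, flags_b)

# Bits raised by both checks
common = np.bitwise_and(flags_a, flags_b)

# A sample is good only if no bit is raised at all
is_good = combined == 0

print(combined)  # [0 3 0 4 1]
print(common)    # [0 0 0 4 0]
print(is_good)   # [ True False  True False False]
```

Both calls work on the whole array at once, so no Python loop over the samples is needed.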
Raising a bit seems as easy as adding 2**bit to the array. For example, bit 4 (counting from 0) has the value 2**4 = 16, so:
flag[2:5] += 2**4
flag
array([ 0, 0, 16, 16, 16, 0, 0, 0, 0, 0], dtype=uint8)
The issue is that this only works if the bit was 0. If it was already raised, the addition carries over: it zeroes that bit and sets the next higher bit to 1 instead:
flag[2] += 2**4
flag
array([ 0, 0, 32, 16, 16, 0, 0, 0, 0, 0], dtype=uint8)
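To see why the addition goes wrong, it can help to look at the binary representation. This is a small sketch using `np.binary_repr`; the variable names are chosen just for this illustration.

```python
import numpy as np

value = np.uint8(16)          # bit 4 already raised
print(np.binary_repr(int(value), width=8))   # 00010000

# Adding 2**4 again carries into bit 5 and clears bit 4
added = value + np.uint8(16)
print(np.binary_repr(int(added), width=8))   # 00100000

# A bitwise OR leaves an already raised bit untouched
ored = value | np.uint8(16)
print(np.binary_repr(int(ored), width=8))    # 00010000
```

The OR gives the same result however many times it is applied, which is the property we want when raising a flag.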
Use bitwise operations
Fortunately, numpy supports bitwise operations that make this easier; see the three functions below:
def raise_bit_inplace(flag, bit=0):
    """Raise bit of the flag array in place

    This function modifies the input array,
    it also works on slices

    Parameters
    ----------
    flag : np.array
        flag bit-array, generally unsigned integer
    bit : int
        bit number to raise
    """
    flag[:] = np.bitwise_or(flag, 2**bit)

def raise_bit(flag, bit=0):
    """Raise bit of the flag array

    Parameters
    ----------
    flag : np.array
        flag bit-array, generally unsigned integer
    bit : int
        bit number to raise

    Returns
    -------
    output_flag : np.array
        input array with the requested bit raised
    """
    return np.bitwise_or(flag, 2**bit)

def check_bit(flag, bit=0):
    """Check if bit of the flag array is raised

    The output is a boolean array which could
    be used for slicing another array.

    Parameters
    ----------
    flag : np.array
        flag bit-array, generally unsigned integer
    bit : int
        bit number to check

    Returns
    -------
    is_raised : bool np.array
        True if the bit is raised, False otherwise
    """
    return np.bitwise_and(flag, int(2**bit)) > 0
is_bit4_raised = check_bit(flag, bit=4)
is_bit4_raised
array([False, False, False, True, True, False, False, False, False,
False])
assert np.all(is_bit4_raised[3:5])
They also work with slices of an array:
raise_bit_inplace(flag[6:], bit=1)
flag
array([ 0, 0, 32, 16, 16, 0, 2, 2, 2, 2], dtype=uint8)
# Running it twice doesn't change the value of the flag
raise_bit_inplace(flag[6:], bit=1)
flag
array([ 0, 0, 32, 16, 16, 0, 2, 2, 2, 2], dtype=uint8)
check_bit(flag, 1)
array([False, False, False, False, False, False, True, True, True,
True])
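A boolean array like this one can be used directly to select elements of another array. The sketch below is self-contained, so it repeats the bitwise test inline instead of calling check_bit; the `data` array is a hypothetical set of measurements invented for this example.

```python
import numpy as np

# Same flag values as at the end of the example above
flag = np.array([0, 0, 32, 16, 16, 0, 2, 2, 2, 2], dtype=np.uint8)

# Hypothetical measurements, one per flag element
data = np.arange(10, dtype=float)

# Same test that check_bit(flag, 1) performs
bit1_raised = np.bitwise_and(flag, 2**1) > 0

# Samples affected by the issue encoded in bit 1
flagged = data[bit1_raised]

# Samples with no issue at all
clean = data[flag == 0]

print(flagged)  # [6. 7. 8. 9.]
print(clean)    # [0. 1. 5.]
```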
First blog post using a Jupyter Notebook with fastpages!!