Saturday, August 14, 2021

Live Streaming using Bokeh

Have you ever wondered whether you can stream data live using Python? Today, we will show how this can be achieved with Bokeh.
By the end of this article, we will have a chart in the browser that redraws itself as new rows are added to a file.
Let's first create a data set to read from. Here we will use a csv file that is updated continuously and read by Bokeh. The file has three columns: the first is a year and the other two hold randomly generated numbers. Create a csv file whose header row names the columns year, number_1 and number_2 (the names the Bokeh script reads later) and save it as D:\Bokeh\sample_data.csv
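As a minimal sketch of such a starting file, the snippet below writes a header row plus a few rows. The header names year, number_1 and number_2 are the columns the Bokeh script reads later in this article; the helper name create_sample_csv, the example values and the temporary folder are assumptions for illustration, so point the path at D:\Bokeh\sample_data.csv on your own machine.

```python
import csv
import os
import tempfile

def create_sample_csv(path, rows):
    """Write a header row followed by the given (year, number_1, number_2) rows."""
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['year', 'number_1', 'number_2'])
        writer.writerows(rows)

# Illustrative values only; any numbers will do.
seed_rows = [(1922, 35, 20), (1923, 60, 41), (1924, 12, 55)]

# A temporary folder keeps the sketch runnable anywhere.
sample_path = os.path.join(tempfile.mkdtemp(), 'sample_data.csv')
create_sample_csv(sample_path, seed_rows)

# Show the file contents to confirm the layout.
with open(sample_path) as f:
    print(f.read())
```

The header row matters: pandas uses it to name the columns, so a file without it would not expose data['year'] in the update function.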
Now let's go to our Jupyter notebook and write code that appends a new row to the csv file every second.
from random import randint
import time

n = 1925
for x in range(n, n+50):
  # append one row per second: year, number_1, number_2
  data = open('d://bokeh//sample_data.csv', "a")
  data.writelines(f'{x},{randint(0, 100)},{randint(10, 60)}\n')
  data.close()
  time.sleep(1)

The above script will add the random numbers for the years 1925 to 1974. We will run this once the actual script is ready. Let's open another notebook.
To run Bokeh, we need two libraries. Install them if they are not already installed.
pip install bokeh
pip install fsspec
After installation, you can verify the versions. This article uses:
Bokeh version 2.3.3
fsspec version 2021.07.0
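One way to check the installed versions from Python is sketched below. It uses the standard-library importlib.metadata module (Python 3.8 or later); the helper name installed_version is an assumption for illustration, and it returns None rather than failing when a package is missing.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# Report the two libraries this article depends on.
for name in ('bokeh', 'fsspec'):
    print(name, installed_version(name) or 'not installed')
```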
After successful installation, we import the libraries below, create the figure and attach two data sources to it. To refresh the plot regularly, we register a periodic callback named update; here it runs every 1 s (1000 ms) and checks the csv file for new data.
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.io import curdoc
import pandas as pd

#create figure
f=figure()
source_1=ColumnDataSource(data=dict(year=[],number_1=[]))
source_2=ColumnDataSource(data=dict(year=[],number_2=[]))

f.circle(x='year',y='number_1',size=7,fill_color='blue',line_color='blue',source=source_1, legend_label='Number 1')
f.line(x='year',y='number_1',source=source_1, line_color='blue')
f.circle(x='year',y='number_2',size=7,fill_color='red',line_color='red',source=source_2, legend_label='Number 2')
f.line(x='year',y='number_2',source=source_2, line_color='red')
    
curdoc().add_root(f)
curdoc().add_periodic_callback(update,1000)

We will show only the last 10 records on the plot by passing rollover=10 to stream, which discards the oldest points once a source holds more than 10. Let's create the update function:
#create periodic function
def update():    
  data = pd.read_csv('d://bokeh//sample_data.csv')        
  x = data['year']    
  y1 = data['number_1']    
  y2 = data['number_2']        
  new_data=dict(year=x,number_1=y1)    
  source_1.stream(new_data,rollover=10)    
  new_data2=dict(year=x,number_2=y2)    
  source_2.stream(new_data2,rollover=10)  
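
The update function above re-reads the whole file on every tick and streams every row again; the plot still looks right because rollover=10 trims each source back to the last 10 points. One possible refinement is to stream only the rows that have not been sent yet. The sketch below shows that bookkeeping in plain Python with the csv module. The names read_new_rows and rows_seen are hypothetical, the temporary file stands in for sample_data.csv, and the Bokeh call is left as a comment so the snippet runs without a server.

```python
import csv
import os
import tempfile

def read_new_rows(path, rows_seen):
    """Return (columns, total_rows) for the data rows after the first rows_seen."""
    with open(path, newline='') as f:
        rows = list(csv.DictReader(f))
    fresh = rows[rows_seen:]
    columns = {
        'year': [int(r['year']) for r in fresh],
        'number_1': [int(r['number_1']) for r in fresh],
        'number_2': [int(r['number_2']) for r in fresh],
    }
    return columns, len(rows)

# Small demonstration with a temporary file standing in for sample_data.csv.
demo_path = os.path.join(tempfile.mkdtemp(), 'sample_data.csv')
with open(demo_path, 'w') as f:
    f.write('year,number_1,number_2\n1925,10,20\n1926,30,40\n')

first, seen = read_new_rows(demo_path, 0)      # both rows are new
with open(demo_path, 'a') as f:
    f.write('1927,50,60\n')
second, seen = read_new_rows(demo_path, seen)  # only the 1927 row is new

print(first['year'], second['year'])
# Inside update() one could then stream only the fresh rows, for example:
# source_1.stream(dict(year=second['year'], number_1=second['number_1']), rollover=10)
```

In a real callback the rows_seen counter would have to persist between calls, for instance as a module-level variable in the served script.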
  
The full code will look like below:
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.io import curdoc
import pandas as pd

#create periodic function
def update():    
  data = pd.read_csv('d://bokeh//sample_data.csv')        
  x = data['year']    
  y1 = data['number_1']    
  y2 = data['number_2']        
  new_data=dict(year=x,number_1=y1)    
  source_1.stream(new_data,rollover=10)    
  new_data2=dict(year=x,number_2=y2)    
  source_2.stream(new_data2,rollover=10)  
  
#create figure
f=figure()
source_1=ColumnDataSource(data=dict(year=[],number_1=[]))
source_2=ColumnDataSource(data=dict(year=[],number_2=[]))

f.circle(x='year',y='number_1',size=7,fill_color='blue',line_color='blue',source=source_1, legend_label='Number 1')
f.line(x='year',y='number_1',source=source_1, line_color='blue')
f.circle(x='year',y='number_2',size=7,fill_color='red',line_color='red',source=source_2, legend_label='Number 2')
f.line(x='year',y='number_2',source=source_2, line_color='red')
    
curdoc().add_root(f)
curdoc().add_periodic_callback(update,1000)


Save this as a .py file. I have saved it in the same folder as our csv file, as d:\bokeh\live_streaming.py
Now we are all set to see the magic. Open a command prompt window, navigate to the folder (d:\bokeh\) and enter
python -m bokeh serve live_streaming.py
If the graph does not open in your default browser (bokeh serve only opens one when given the --show option), copy the link printed in the command prompt and paste it into the browser. At this point it will show a static graph. Now it is time to update the csv file: run the first notebook we created and watch the graph in the browser.
Have fun!