Your Typing Sucks More Than Your Vacuum Cleaner !!

Welcome back to another irrelevant and crazy project (* hey hey, it's important, alright!!). This one comes after the successful launch (* me believing it was a crazy hit) of the infamous T-Man (* your AI-powered Tinder wingman). If you haven't read it, go read it (* please, me wants more views): T-Man.

I welcome you to Protyper.

(* In a dramatic scene)

* wait wait I got one more

Okay, enough! Let's get technical now.

As usual, I will be dividing the problem into several sub-problems and trying to solve them individually.

  1. Capturing screen
  2. Detecting characters within the captured image
  3. Controller to detect and respond to user input
  4. Creating notifications to show information

Capturing the screen is fairly easy. There are tons of tools in Python to take screenshots. In this project we will be using pyautogui, as it lets you capture a specific region of your screen.
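Here is a minimal sketch of a region capture, assuming the region is marked by two corner points. The helper names region_from_corners and capture_region are made up for illustration and are not taken from the repo; the one real API detail is that pyautogui.screenshot takes its region as (left, top, width, height).

```python
def region_from_corners(corner_a, corner_b):
    """Turn two opposite corners (x, y) into (left, top, width, height)."""
    left = min(corner_a[0], corner_b[0])
    top = min(corner_a[1], corner_b[1])
    width = abs(corner_a[0] - corner_b[0])
    height = abs(corner_a[1] - corner_b[1])
    return (left, top, width, height)


def capture_region(corner_a, corner_b):
    """Screenshot only the rectangle spanned by the two corners."""
    # Imported lazily because pyautogui needs a display to load.
    import pyautogui

    return pyautogui.screenshot(region=region_from_corners(corner_a, corner_b))
```

Calling capture_region((100, 400), (300, 200)) would grab a 200 by 200 box whose top-left corner sits at (100, 200).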

There are two options for detecting characters: the hard option and the easy option. The hard option is to create our own neural network from scratch to detect characters (* oh come on, who does that). The easy option is to use someone else's model (* work smart, not hard). So, in this project we will be using pytesseract and OpenCV to detect characters.
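A rough sketch of how pytesseract and OpenCV can be chained, assuming the input is a PIL image such as a screenshot. The grayscale plus Otsu threshold step is a common preprocessing choice, not necessarily what the repo does, and both function names are hypothetical.

```python
def clean_ocr_text(raw):
    """Collapse the newlines and form feeds Tesseract leaves in its output."""
    return " ".join(raw.split())


def image_to_text(image):
    """Run OCR on a PIL image (e.g. a pyautogui screenshot)."""
    # Imported lazily so clean_ocr_text works without the OCR stack installed.
    import cv2
    import numpy as np
    import pytesseract

    # Drop any alpha channel, then convert to a single grayscale channel.
    gray = cv2.cvtColor(np.array(image.convert("RGB")), cv2.COLOR_RGB2GRAY)
    # Otsu picks the black/white threshold automatically.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return clean_ocr_text(pytesseract.image_to_string(binary))
```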

For the controller, once again there are tons of tools that provide interfaces to listen to and control mouse / keyboard events. In this project we will be using pynput.
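To give a feel for the listener side, here is a small sketch that routes key presses to handler functions. KeyDispatcher and listen are hypothetical names for illustration and the repo may wire things differently; the pynput part that is real is keyboard.Listener with an on_press callback.

```python
class KeyDispatcher:
    """Maps key names to handler functions."""

    def __init__(self):
        self.handlers = {}

    def register(self, key_name, handler):
        self.handlers[key_name] = handler

    def dispatch(self, key):
        # Special keys (pynput Key members) expose .name, printable keys expose .char.
        name = getattr(key, "name", None) or getattr(key, "char", None)
        handler = self.handlers.get(name)
        if handler is None:
            return False
        handler()
        return True


def listen(dispatcher):
    """Block until the listener stops, forwarding every key press."""
    # Imported lazily because pynput needs a display server to load.
    from pynput import keyboard

    def on_press(key):
        # Return None on purpose: returning False would stop the listener.
        dispatcher.dispatch(key)

    with keyboard.Listener(on_press=on_press) as listener:
        listener.join()
```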

Creating notifications is not that hard. We could write our own code to talk to D-Bus, or use a library to do it for us. (* as you have already guessed, I am lazy and will be using a library) In this project we will be using plyer to create OS-level notifications.
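A tiny sketch of what a plyer notification looks like. The wrapper name notify, the send parameter, and the 3 second timeout are my own choices for illustration; the underlying call is plyer's notification.notify.

```python
def notify(title, message, send=None):
    """Show a desktop notification; `send` can be swapped out for testing."""
    if send is None:
        # Imported lazily so the function can be exercised without plyer.
        from plyer import notification

        send = notification.notify
    send(title=title, message=message, timeout=3)
```

Something like notify("Protyper", "Region captured") would then pop up a notification through whatever backend plyer picks for your OS.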

Code

As usual, you can find the whole implementation on my personal GitHub repository: Pro-Typer

  1. python 3.10 (*cuz why not !! … it came with match statements, basically switch, yehhhhhh)
  2. pytesseract

On Debian-based systems, run

sudo apt install tesseract-ocr

Note: Do this if you do not want to add the tesseract executable to your PATH manually

  1. Clone the repo

git clone https://github.com/dcostersabin/pro-typer.git

  2. Go inside pro-typer

cd pro-typer

  3. Install requirements

pip3 install -r requirements.txt

  4. Run the script

python3 main.py

We will start by creating an abstract class namely Trigger, which will monitor all the key events.

from abc import ABC, abstractmethod


class Trigger(ABC):
    # ........... bla bla bla ........

    @abstractmethod
    def on_keydown(self, key):
        # Subclasses put their key-handling logic here.
        pass

You can see that I have created an abstract method, namely on_keydown, which is where subclasses will write the business logic.
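To show how the abstract method is meant to be used, here is a self-contained sketch. The Trigger below is only a stand-in for the one in the article (the real class also monitors key events), and KeyRecorder is a made-up subclass.

```python
from abc import ABC, abstractmethod


class Trigger(ABC):
    """Stand-in for the article's Trigger, reduced to the abstract hook."""

    @abstractmethod
    def on_keydown(self, key):
        ...


class KeyRecorder(Trigger):
    """Hypothetical subclass that just remembers every key it sees."""

    def __init__(self):
        self.seen = []

    def on_keydown(self, key):
        self.seen.append(key)
```

Because on_keydown is abstract, Python refuses to instantiate Trigger itself, so every subclass is forced to say what happens on a key press.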

Next, we will be creating somewhat of a helper class which will store the starting and ending coordinates of the region to be captured, taken from the cursor's position.

class ImageSize:
    # ............. bla bla bla .......................

    def is_valid(self):
        # Both corners must have been captured first.
        if self.start is None or self.end is None:
            self.invalid_notify()
            return False

        height, width = self.height_width
        x_valid = self.start[0] < self.end[0]
        y_valid = self.start[1] > self.end[1]
        h_w_valid = height > 28 and width > 28

        status = x_valid and y_valid and h_w_valid

        if not status:
            self.invalid_notify()

        return status

So, the main intuition behind a valid region is that the x coordinate of start must be less than the x coordinate of end, and the y coordinate of start must be greater than the y coordinate of end. On top of that, both the height and the width must be greater than 28.
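The same rule can be sketched as a plain function on (x, y) tuples, which makes it easy to poke at. The function name is hypothetical, and computing width and height as plain coordinate differences is an assumption, since the article does not show how height_width is calculated.

```python
MIN_SIDE = 28  # height and width must be strictly greater than this


def is_valid_region(start, end, min_side=MIN_SIDE):
    """Mirror of the is_valid rule as a standalone function."""
    if start is None or end is None:
        return False
    # Assumption: width and height are the plain coordinate differences.
    width = end[0] - start[0]
    height = start[1] - end[1]
    x_valid = start[0] < end[0]
    y_valid = start[1] > end[1]
    return x_valid and y_valid and height > min_side and width > min_side
```

So a start of (100, 400) with an end of (300, 200) passes, while swapping the two points, or picking a box of exactly 28 by 28, fails.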

The screenshot mechanism is simple. I won't be explaining how it's done, but you might find a weird keyword, namely callbacks.

def __init__(self, callbacks=None):
    self.image = None
    self.callbacks = callbacks
    self.image_size = ImageSize()
    self.mouse_controller = Controller()
    super().__init__()

It is just a fancy way of passing functions to another function. Whenever we take a screenshot, every callback function is called with the captured image as its argument.

def _call_callbacks(self):
    # Hand the captured image to every registered callback, if there are any.
    for callback in self.callbacks or []:
        callback(self.image)
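If callbacks still feel like magic, here is a stripped-down sketch you can run on its own. The Screenshot class and its capture method are hypothetical stand-ins: the real class grabs the screen, while this one takes the "image" as an argument so the flow is easy to follow.

```python
class Screenshot:
    """Hypothetical stand-in for the article's capture class."""

    def __init__(self, callbacks=None):
        self.image = None
        self.callbacks = callbacks

    def capture(self, image):
        # The real class grabs the screen; here the image is passed in directly.
        self.image = image
        self._call_callbacks()

    def _call_callbacks(self):
        # Every registered function receives the same captured image.
        for callback in self.callbacks or []:
            callback(self.image)
```

Passing callbacks=[print] would print whatever gets captured, and passing an OCR function instead is roughly how the captured image ends up at the character detector.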

The ImageToChar class inside OCR is just a simple abstract class that detects characters within the passed image; the detected characters are stored in a variable called prediction.

To control all the activities we will be creating a class, namely SysController, cuz why not. Adding Sys to a class name makes the class look more important than other classes (* RIP my logic).

The key map goes like.

Voilà, you have an OCR on steroids now. Here are a few screenshots of the system.

Running The Script
Pressing Shift And Alt To Capture Starting And Ending Coordinates
Pressing Down Arrow to write the captured characters.

Now you can type a gazillion words in seconds. You can use this program to comment on my post (* me promoting myself in my own article)

Who is Sabin Sharma ?

If you are interested in automation you should check my other article Automating Tinder?! Perks Of Being A Programmer

If you are interested in Algorithms you should check my other articles, Molecular Dynamics Simulation of Hard Spheres — Priority Queue In Action With Java , Make Your Own AI Powered Game, NVIDIA Fan Controller For Linux(DIY).

And If you are interested in Cyber Security read my recent article Getting Your Hands Dirty: Exploiting Buffer Overflow Vulnerability In C
