Third-party plugins
This page lists Auto.js plugins developed by third-party developers.
Pytorch plugin
This plugin and its documentation are developed and provided by the third-party developer Haoran. Special thanks.
To use this module, install the Pytorch-AjPlugin extension plugin, then load it with let pytorch = $plugins.load("com.hraps.pytorch").
The Pytorch module runs trained deep-learning neural network models on Android devices. It can implement features that are hard to achieve with conventional programs, such as image recognition, translation, and Q&A.
Before use, make sure you have a trained neural network model, and convert the model file into a mobile-friendly TorchScript file using Python, e.g. torch.jit.trace(model, input_tensor).
If you are not familiar with Pytorch, the author (Haoran) recommends learning here: Dive into DL (PyTorch).
You can test device support with pytorch.debugOd() and preview the recognition quality for object detection.
Exporting Pytorch weights
You need to convert .pt to .torchscript.pt to run on mobile. Use tracing to convert the model into a mobile-usable form. Models that rely on third-party libraries may not work well; it is recommended to write and train models in pure Pytorch.
Python conversion script on desktop:
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

model = make_model('resnet18') # build the model structure
model.load_state_dict(torch.load(model_pt)) # load trained weights
model.eval() # set eval mode (required)
input_tensor = torch.rand(1, 3, 224, 224) # input format: 224x224 RGB (3 channels), random tensor
mobile = torch.jit.trace(model, input_tensor, strict=False) # trace to TorchScript
mobile = optimize_for_mobile(mobile) # optional mobile optimization
mobile.save(mobile_pt) # save the file
Basic functions
pytorch.load(path[,device])
- path {String} Model path
- device {int} Execution device: 0 for CPU, 1 for VULKAN. Default 0.
- Returns {PytorchModule}
Load a neural network model. Only a small number of devices support VULKAN; you can omit device.
When path is "yolov5s", it loads the built-in object detection model. When path is "textcnn", it loads the built-in sentiment analysis model.
pytorch.forward(module,input)
- module {PytorchModule} Loaded neural network module
- input {Tensor} Input tensor
- Returns {Tensor} Output tensor
Run forward inference and return the output tensor.
pytorch.forwardTuple(module,input)
- module {PytorchModule} Loaded neural network module
- input {Tensor} Input tensor
- Returns {Tensor} Output tensor
Run forward inference and return the output tensor.
Compared to pytorch.forward(), this returns the first item of a tuple output, suitable for models like object detection. Internally it is essentially module.forward(IValue.from(inputTensor)).toTuple()[0].toTensor().
pytorch.destory(module)
module{PytorchModule} Neural network module to release
Release the neural network module.
Tensor class
Tensor is the common input/output data structure for neural networks. It is a high-dimensional array that enables fast processing. For example, an image of 100×200 with 3 RGB channels is converted to a float array of length 100×200×3 before feeding into the model.
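The flattening described above can be sketched in plain JS. This is only an illustration of the index arithmetic; the [C, H, W] channel-first layout is an assumption based on the usual Pytorch convention, not something the plugin guarantees:

```javascript
// Total number of values in a tensor of the given shape,
// e.g. a [1, 3, 224, 224] image tensor holds 1*3*224*224 floats.
function tensorLength(shape) {
    return shape.reduce(function (a, b) { return a * b; }, 1);
}

// Flat index of (channel c, row y, column x) in an assumed [C, H, W] layout.
function chwIndex(c, y, x, h, w) {
    return c * h * w + y * w + x;
}
```

For the 100×200 RGB example in the text, `tensorLength([3, 100, 200])` gives the 60000-value float array the model consumes.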
pytorch.fromBlob(arr,shape)
- arr {List} JS array
- shape {List} Tensor shape
- Returns {Tensor} Constructed tensor
Construct a tensor from a JS array.
tensor.getDataAsFloatArray()
- Returns {List} Float array converted from the tensor
Convert the tensor to a float array.
tensor.getDataAs[Byte/Double/Float/Int/Long]Array()
- Returns {List<…>} Array converted from the tensor
Convert the tensor to arrays of various primitive types.
ObjectDetection - object detection functions
Object detection analyzes the location and category of items in an image. If you are unsure, use pytorch.debugOd() to preview the effect.
pytorch.debugOd([modulePath,classPath])
- modulePath {String} Model file path (.pt/.pth)
- classPath {String} Class labels file path (.txt)
Test an object detection model file and open the built-in debug page.
Call with no args to use the built-in OD weights and test device support, i.e. pytorch.debugOd().
pytorch.liveOd(modulePath,classPath)
- modulePath {String} Model file path (.pt/.pth)
- classPath {String} Class labels file path (.txt)
Enter the camera real-time recognition page to view streaming results.
ObjectDetection (OD) - commonly used functions
Common helper functions for building your own object detection pipeline.
pytorch.getIOU(a,b)
- a {Rect} Rect A
- b {Rect} Rect B
- Returns {float} IoU value
Compute the overlap ratio (IoU) of two rectangles.
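For reference, the IoU computation can be sketched in plain JS. This is an illustration of the math, not the plugin's implementation; it assumes a Rect exposes left/top/right/bottom as in the images module:

```javascript
// IoU (intersection over union) of two axis-aligned rectangles.
function iou(a, b) {
    // Width/height of the intersection, clamped at 0 when the rects are disjoint.
    var ix = Math.max(0, Math.min(a.right, b.right) - Math.max(a.left, b.left));
    var iy = Math.max(0, Math.min(a.bottom, b.bottom) - Math.max(a.top, b.top));
    var inter = ix * iy;
    var union = (a.right - a.left) * (a.bottom - a.top) +
                (b.right - b.left) * (b.bottom - b.top) - inter;
    return union === 0 ? 0 : inter / union;
}
```

Identical rectangles give 1; disjoint rectangles give 0.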
pytorch.bitmapToTensor(img[,mean,std])
- img {Bitmap} Source image (Bitmap)
- mean {List} Normalization mean. Default [0.485, 0.456, 0.406]
- std {List} Normalization std. Default [0.229, 0.224, 0.225]
- Returns {Tensor} Tensor converted from the image
Convert an image to a Tensor for model input. The image must be a Bitmap. If you have an Auto.js Image, use image.getBitmap() to convert.
mean and std are used to normalize image colors. Use the same values as during training. If unsure, you can set mean [0,0,0] and std [1,1,1].
img = images.read("/sdcard/a.jpg");
inputTensor = pytorch.bitmapToTensor(img.getBitmap(),[0,0,0],[1,1,1]);
pytorch.floatsToResults(floats,row,column,imgScaleX,imgScaleY[,threshold])
- floats {List} Output array from a YOLO model
- row {int} Number of rows (detections)
- column {int} Data width per detection
- imgScaleX {float} Image scale factor on X
- imgScaleY {float} Image scale factor on Y
- threshold {float} Minimum confidence threshold to keep
- Returns {List} All converted results
Convert YOLO output array to a results list. floats is the full output data; floats.length should equal row * column.
If you resized the image to a fixed input size for the network, you can use imgScaleX/imgScaleY to map result coordinates back to the original image.
YOLO model output is composed of detection cells. Each detection consists of confidence, box location, and per-class probabilities. For example, in the built-in YOLO model, each output contains 4 box values (x,y,w,h), 1 confidence value, and 80 class probabilities, so column = 4 + 1 + 80 = 85. row is the number of detection cells.
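The layout above can be illustrated by slicing one detection out of the flat output array. The helper below and its field order are illustrative, based only on the 4 + 1 + 80 layout just described:

```javascript
// Assumed per-detection layout: [x, y, w, h, confidence, 80 class probs].
var COLUMN = 4 + 1 + 80; // 85 values per detection

// Decode one detection row from the flat float array.
function decodeDetection(floats, rowIndex) {
    var base = rowIndex * COLUMN;
    var probs = floats.slice(base + 5, base + COLUMN);
    // Pick the class with the highest probability.
    var best = probs.indexOf(Math.max.apply(null, probs));
    return {
        x: floats[base], y: floats[base + 1],
        w: floats[base + 2], h: floats[base + 3],
        confidence: floats[base + 4],
        classIndex: best
    };
}
```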
pytorch.useNMS(results[,limit,threshold])
- results {List} All results
- limit {int} Max number of remaining results
- threshold {float} Box overlap threshold
- Returns {List} Results after NMS
Filter duplicate results. NMS (Non-Max Suppression) removes boxes whose overlap is higher than threshold.
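The idea behind NMS can be sketched in plain JS as a greedy loop: keep the highest-score box, drop every box that overlaps it beyond the threshold, and repeat. This is an approximation for illustration; pytorch.useNMS's exact behavior may differ:

```javascript
// IoU helper (assumes rects with left/top/right/bottom).
function iou(a, b) {
    var ix = Math.max(0, Math.min(a.right, b.right) - Math.max(a.left, b.left));
    var iy = Math.max(0, Math.min(a.bottom, b.bottom) - Math.max(a.top, b.top));
    var inter = ix * iy;
    var union = (a.right - a.left) * (a.bottom - a.top) +
                (b.right - b.left) * (b.bottom - b.top) - inter;
    return union === 0 ? 0 : inter / union;
}

// Greedy non-max suppression over {rect, score} results.
function nms(results, limit, threshold) {
    var sorted = results.slice().sort(function (p, q) { return q.score - p.score; });
    var kept = [];
    for (var i = 0; i < sorted.length && kept.length < limit; i++) {
        var keep = true;
        for (var j = 0; j < kept.length; j++) {
            if (iou(sorted[i].rect, kept[j].rect) > threshold) { keep = false; break; }
        }
        if (keep) kept.push(sorted[i]);
    }
    return kept;
}
```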
OdResult class
Represents a single object detection result, containing rect, score, and classId.
odResult.score
- score {float} Confidence score
odResult.rect
- rect {Rect} Bounding box. See the images module docs for Rect.
odResult.classIndex
- classIndex {int} Class index
NaturalLanguageProcessing (NLP) functions
Provides helper functions for NLP models.
pytorch.debugTec([modulePath,vocabPath])
- modulePath {String} Model file path (.pt/.pth)
- vocabPath {String} Vocabulary ids file path (.txt)
Test a TextCNN sentiment analysis model file and open the built-in debug page.
Call with no args to use the built-in sentiment weights and test device support, i.e. pytorch.debugTec().
pytorch.simplifySentence(sentence)
- sentence {String} Input sentence
- Returns {String} Simplified sentence
Simplify an English sentence: keep only letters and digits, and convert to lowercase.
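The described behavior can be sketched in plain JS. This is an assumed equivalent for illustration; pytorch.simplifySentence's exact rules may differ:

```javascript
// Keep only letters and digits, lowercase everything, and
// collapse the removed characters into single spaces.
function simplifySentence(sentence) {
    return sentence.toLowerCase()
        .replace(/[^a-z0-9]+/g, " ")
        .trim();
}
```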
Vocab class
Vocabulary class that provides efficient conversion between words and ids.
pytorch.vocabPath(path)
- path {String} Vocabulary file path
- Returns {Vocab} Vocab instance
Load a vocabulary from a file. The file should contain one word per line, and the line-number ↔ word mapping must match the word embedding / vocab used during training.
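The line-number ↔ word mapping can be sketched in plain JS. The helper buildVocab and the zero-based ids are illustrative assumptions; the plugin's Vocab class handles this internally:

```javascript
// Build both directions of the mapping from the file's lines:
// line i <-> id i (zero-based here, which is an assumption).
function buildVocab(lines) {
    var idToWord = lines.slice();
    var wordToId = {};
    for (var i = 0; i < lines.length; i++) {
        wordToId[lines[i]] = i;
    }
    return { idToWord: idToWord, wordToId: wordToId };
}
```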
pytorch.vocab(words)
- words {List} Word list
- Returns {Vocab} Vocab instance
Construct a vocabulary directly from a word list.
vocab.size()
- Returns {long} Vocab size
Get the number of words in the vocab.
vocab.getWord(id)
- id {long} Word id
- Returns {String} Word text
Get the word for the given id.
vocab.getId(word)
- word {String} Word text
- Returns {long} Word id
Get the id for the given word.
vocab.getWords(ids[,length])
- ids {List} Word ids
- length {int} Returned list length
- Returns {List} Word texts
Get the list of words for multiple ids.
vocab.getIds(words[,length])
- words {List} Word texts
- length {int} Returned list length; missing items are padded with 0
- Returns {List} Word ids
Get the list of ids for multiple words.
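The fixed-length padding behavior can be sketched in plain JS. The helper and the choice of 0 for unknown words are assumptions matching the built-in vocab's convention described in the examples below:

```javascript
// Map words to ids, producing exactly `length` entries:
// missing words and trailing padding both become 0.
function getIds(wordToId, words, length) {
    var ids = [];
    for (var i = 0; i < length; i++) {
        var w = words[i];
        ids.push(w !== undefined && wordToId[w] !== undefined ? wordToId[w] : 0);
    }
    return ids;
}
```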
Examples
Image object detection (YOLOv5s)
/**
* Pytorch plugin - object detection (YOLOv5) example
*
* Author: Haoran (Q:2125764918)
*/
// Quick demo (visual debug page):
/*
pytorch = $plugins.load("com.hraps.pytorch")
pytorch.debugOd()
exit()
*/
// Load plugin module
pytorch = $plugins.load("com.hraps.pytorch")
// Load neural network model (built-in YOLOv5s)
var model = pytorch.load("yolov5s")
// Load class names (string array). You can hardcode it, e.g. ["car","plane","person"...]
var classes = pytorch.getCocoClasses()
// Define model input width/height. Input shape: w*h*3
var inputWidth = 640
var inputHeight = 640
// Define output shape: row*column
// row depends on input size (grid count)
var outputRow = 25200
// column = box(x,y,w,h)=4 + score=1 + class probs(80 for COCO) => 85
var outputColumn = 4 + 1 + 80
// Load input image
var img = images.read("/sdcard/DCIM/Camera/b.jpg")
// Resize to model input size
var inputImg = images.resize(img, [inputWidth, inputHeight])
// Convert image to tensor. Use mean=[0,0,0], std=[1,1,1] (no special normalization)
inputTensor = pytorch.bitmapToTensor(inputImg.getBitmap(), [0, 0, 0], [1, 1, 1])
// Run inference to get output tensor
output = pytorch.forwardTuple(model, inputTensor)
// Tensor -> float array
f = output.getDataAsFloatArray()
log("Output length: " + f.length)
// Scale factors
imgScaleX = img.getWidth() / inputWidth
imgScaleY = img.getHeight() / inputHeight
// Convert outputs to results (map back to original image coordinates)
results = pytorch.floatsToResults(f, outputRow, outputColumn, imgScaleX, imgScaleY)
log("Initial detections: " + results.size())
// Apply NMS to remove duplicates
nmsResults = pytorch.useNMS(results)
toastLog("Final detections: " + nmsResults.size())
// Iterate results
for (var i = 0; i < nmsResults.size(); i++) {
result = nmsResults.get(i)
rect = result.rect
str = "Class: " + classes.get(result.classIndex) + " Score: " + result.score +
" Box: left=" + rect.left + " top=" + rect.top + " right=" + rect.right + " bottom=" + rect.bottom;
log(str)
}
Sentiment analysis (TextCNN)
/**
* Pytorch plugin - sentiment analysis (TextCNN) example
*
* Author: Haoran (Q:2125764918)
*/
// Quick demo:
/*
pytorch = $plugins.load("com.hraps.pytorch")
pytorch.debugTec()
exit()
*/
// Input sentence
var text = "The program is useful!"
// Load plugin module
pytorch = $plugins.load("com.hraps.pytorch")
// Load neural network model (built-in textcnn)
var model = pytorch.load("textcnn")
// Load vocab. One word per line. Line number should match the ids used in training (built-in vocab used here)
var vocab = pytorch.getTextcnnVocab()
log("Vocab loaded. Size=" + vocab.size());
// Simplify sentence: remove punctuation, lowercase
var textSimple = pytorch.simplifySentence(text)
log("Simplified: " + textSimple);
// Model input length
var inputSize = 128
// Convert words to ids. Unknown words and padding are filled with 0
var ids = vocab.getIds(textSimple.split(" "), inputSize)
log(ids)
// Build Tensor for model input
var inputTensor = pytorch.fromBlob(ids, [1, 128])
// Run inference
var outputTensor = pytorch.forward(model, inputTensor)
// Output tensor -> float array
var result = outputTensor.getDataAsFloatArray()
log("Model output: " + result[0] + " " + result[1])
// Interpret result
console.info("Sentence: " + text)
if (result[0] <= result[1]) {
console.info("Result: positive")
} else {
console.info("Result: negative")
}
OCR plugin
This plugin and its documentation are developed and provided by the third-party developer Haoran. Special thanks.
This module recognizes text in images (see examples at the end). It is implemented with a deep-learning pipeline: DbNet + AngleNet + CrnnNet.
Download: LanZou Cloud
Recognition
ocr.detect(img[,ratio])
- img {Bitmap} Image to recognize (Bitmap). For an Auto.js Image, call .getBitmap() to convert.
- ratio {float} Scale ratio, default 1. Use a smaller value for small images if needed.
- Returns {List} List of results. See OcrResult below.
OCR recognition. The first call initializes automatically. See examples below.
Filtering
ocr.filterScore(results,dbnetScore,angleScore,crnnScore)
- results {List} Results list to filter
- dbnetScore {float} Minimum confidence for “this region is text”
- angleScore {float} Minimum confidence for angle classification
- crnnScore {float} Minimum average confidence for text recognition
- Returns {List} Filtered results list
Filters low-confidence results. Use 0 for a threshold to disable that filter.
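The filtering can be sketched in plain JS. Averaging the per-character crnnScore list into one value is an assumption based on the parameter descriptions above; this sketch is not the plugin's implementation:

```javascript
// Keep only results whose confidences meet every threshold.
// A threshold of 0 effectively disables that filter.
function filterScore(results, dbnetScore, angleScore, crnnScore) {
    return results.filter(function (r) {
        // Average per-character recognition confidence (assumed semantics).
        var avg = r.crnnScore.reduce(function (a, b) { return a + b; }, 0) /
                  Math.max(1, r.crnnScore.length);
        return r.dbnetScore >= dbnetScore &&
               r.angleScore >= angleScore &&
               avg >= crnnScore;
    });
}
```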
OcrResult class
Each instance represents one OCR result item.
ocrResult.text
- {String} Recognized text
ocrResult.frame
- {List} Result polygon coordinates
The result is an arbitrary quadrilateral, returned as an integer list of length 8: [x1,y1,x2,y2,x3,y3,x4,y4], the four vertices in order.
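When an axis-aligned bounding box is more convenient than the raw quadrilateral, one can be derived from the 8-value frame. The helper below is an illustration, not part of the plugin's API:

```javascript
// Axis-aligned bounds of the quad [x1,y1,x2,y2,x3,y3,x4,y4].
function frameToBounds(frame) {
    var xs = [frame[0], frame[2], frame[4], frame[6]];
    var ys = [frame[1], frame[3], frame[5], frame[7]];
    return {
        left: Math.min.apply(null, xs),
        top: Math.min.apply(null, ys),
        right: Math.max.apply(null, xs),
        bottom: Math.max.apply(null, ys)
    };
}
```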
ocrResult.angleType
- {int} Text angle type
ocrResult.dbnetScore
- {float} Confidence for the region being text
ocrResult.angleScore
- {float} Confidence for the angle type
ocrResult.crnnScore
- {List} Per-character confidence list
Example
// Load plugin
ocr = $plugins.load("com.hraps.ocr")
// Load the image to recognize (set your own path)
img = images.read("./test.jpg")
// OCR
results = ocr.detect(img.getBitmap(),1)
console.info("Results before filter: " + results.size())
// Filter
results = ocr.filterScore(results,0.5,0.5,0.5)
// Print results
for (var i = 0; i < results.size(); i++) {
    var re = results.get(i)
    log("Result:" + i + " Text:" + re.text + " Frame:" + re.frame + " AngleType:" + re.angleType)
    log("Region score:" + re.dbnetScore + " Angle score:" + re.angleScore + " Text score:" + re.crnnScore + "\n")
}