Using Python and graphs to explain how machine learning helps Japan implement self-driving taxis for the 2020 Olympics [with code]

Although the 2020 Tokyo Olympic Games are still three years away, Japan is preparing for the event in full swing. One of the preparations is to develop a fleet of self-driving taxis.

Japanese autonomous vehicle company ZMP has announced that it will work with Hinomaru Kotsu, a Tokyo-based taxi company, to develop a driverless taxi. ZMP has been developing its own automated driving technology in recent years, covering both hardware and software.

Since last year, ZMP has been testing its automated driving technology on the streets, and the research will be applied to the taxi. Sources said the taxi will serve as a shuttle for travelers in Tokyo. The company is still testing its cars under the supervision of human drivers and hopes to achieve automated driving without human monitoring by the end of this year.

The technology behind it

The company has not disclosed the software it uses, but my guess is that it builds its self-driving software on a deep learning library such as PyTorch, Caffe2 or TensorFlow. These libraries all let programmers run similar machine learning algorithms.

Today I would like to demonstrate the technology using TensorFlow.

Object Detection with TensorFlow Object Detection API

If you are a professional programmer, please check the links below:

Official blog from Google: https://research.googleblog.com/2017/06/supercharge-your-computer-vision-models.html
Code: https://github.com/tensorflow/models/blob/master/object_detection/object_detection_tutorial.ipynb

Driving the car itself is not really the problem, even without deep learning or machine learning; the real problem is object detection. Will the car hit a pedestrian or a traffic light?

There are several state-of-the-art techniques we can use. Here is a selection of the most popular detection models.

Single Shot Multibox Detector (SSD) with MobileNets
SSD with Inception V2
Region-Based Fully Convolutional Networks (R-FCN) with Resnet 101
Faster RCNN with Resnet 101
Faster RCNN with Inception Resnet v2
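Whichever model is chosen, what comes out for each picture is a set of boxes, each with a confidence score and a class id. The sketch below uses made-up numbers to show how such output could be filtered to answer the "pedestrian or traffic light?" question. The box layout `[ymin, xmin, ymax, xmax]` follows the TensorFlow Object Detection API tutorial; the label ids (1 for person, 10 for traffic light) are assumed from the COCO label map, and `confident_detections` is a hypothetical helper, not part of any library.

```python
import numpy as np

# Hypothetical detector output for one frame (made-up values).
# Boxes are [ymin, xmin, ymax, xmax], normalized to the range 0..1.
boxes = np.array([
    [0.40, 0.10, 0.90, 0.30],
    [0.05, 0.60, 0.20, 0.65],
    [0.50, 0.50, 0.55, 0.55],
])
scores = np.array([0.92, 0.81, 0.12])
classes = np.array([1, 10, 1])

# Assumed label ids from the COCO label map: 1 = person, 10 = traffic light.
LABELS = {1: "person", 10: "traffic light"}

def confident_detections(boxes, scores, classes, min_score=0.5):
    """Return (label, score, box) for known labels scoring at least min_score."""
    results = []
    for box, score, cls in zip(boxes, scores, classes):
        if score >= min_score and int(cls) in LABELS:
            results.append((LABELS[int(cls)], float(score), box.tolist()))
    return results

detections = confident_detections(boxes, scores, classes)
for label, score, box in detections:
    print(label, round(score, 2), box)
```

The third box is dropped because its score is below the threshold; picking that threshold is a trade-off between missed objects and false alarms.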

Simple Explanation

First, you need to understand that there are generally four types of image processing tasks.

We are focusing on the second one, "Classification + Localization".

A video in its basic form is just a series of pictures. Each picture in a video is called a frame, so 60 fps means 60 pictures per second. For every frame, we mark the coordinates of the cat in the picture: 4 numbers that define a rectangle around the object (the cat) on the 2D picture.

So finally, our data can be described as a large number of pairs, each made up of a picture and its 4 coordinate numbers.
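To make the "picture plus 4 numbers" idea concrete, here is a minimal sketch using NumPy. The video is a blank made-up array, the box values are invented, and the `(x_min, y_min, x_max, y_max)` pixel layout and the `draw_box` helper are assumptions chosen for illustration only.

```python
import numpy as np

# A tiny made-up "video": 3 frames, each 120 x 160 pixels with 3 colour channels.
video = np.zeros((3, 120, 160, 3), dtype=np.uint8)

# One box per frame as (x_min, y_min, x_max, y_max) in pixels (made-up values).
boxes = np.array([
    [30, 40, 90, 100],
    [32, 40, 92, 100],
    [34, 41, 94, 101],
])

# The training data is just a list of (picture, 4 numbers) pairs.
dataset = list(zip(video, boxes))

def draw_box(frame, box, value=255):
    """Return a copy of the frame with the rectangle outline drawn on it."""
    x_min, y_min, x_max, y_max = box
    out = frame.copy()
    out[y_min, x_min:x_max + 1] = value   # top edge
    out[y_max, x_min:x_max + 1] = value   # bottom edge
    out[y_min:y_max + 1, x_min] = value   # left edge
    out[y_min:y_max + 1, x_max] = value   # right edge
    return out

frame, box = dataset[0]
annotated = draw_box(frame, box)
print(frame.shape, box)
```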

What are these boxes between the cat and the arrow?

I am glad you asked. They generally include convolution and max pooling layers; here is an illustration.

Convolution and Maxpool: techniques used in image processing (applications include AlphaGo) to condense an image into a smaller grid of values. This enables the model to consider fewer pixels when making a decision while still keeping a good result (in most cases an even better one).
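A rough sketch of both operations in plain NumPy may help. This is not how a deep learning library implements them internally (those are heavily optimized); the 6 x 6 input and the 3 x 3 filter are made-up values, and `convolve2d` and `max_pool` are illustrative helper names.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Like most deep learning libraries, this does not flip the kernel,
    so strictly speaking it is a cross-correlation.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(image, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    h = image.shape[0] // size * size
    w = image.shape[1] // size * size
    blocks = image[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)   # made-up 6 x 6 "picture"
kernel = np.array([[1.0, 0.0, -1.0]] * 3)          # simple vertical-edge filter

features = convolve2d(image, kernel)   # 6 x 6 shrinks to 4 x 4
pooled = max_pool(features)            # 4 x 4 shrinks to 2 x 2
print(features.shape, pooled.shape)
```

The picture goes from 36 values to 4, which is the "fewer pixels for decision making" effect described above.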

Want to know more about image processing? Check out my blog about Self-piloting AI: https://steemit.com/gaming/@jimsparkle/using-ai-to-self-piloting-x-wing-in-star-wars-battlefront-with-howto-and-code

Training

The machine first generates 4 (more or less) random numbers as its output, compares them with the true coordinates, and adjusts the model to produce better numbers next time.

If we leave the machine to train by itself for days, the result will be very good. And this sums up how Facebook detects our faces, Tesla detects traffic lights, and the FBI detects you! (<- joke)
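The guess, compare, adjust loop can be sketched with a toy model. This is only an illustration under strong assumptions: each picture is assumed to be already reduced to 8 numbers, the "model" is a single matrix of weights rather than a deep network, and all data is randomly generated. A real detector is trained the same way in spirit, with many more parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: 200 "pictures" already reduced to 8 features each,
# with true boxes generated from a hidden linear rule.
features = rng.normal(size=(200, 8))
true_weights = rng.normal(size=(8, 4))
true_boxes = features @ true_weights

# The model starts with random weights, so its first guesses are random.
weights = rng.normal(size=(8, 4))

def mean_squared_error(predicted, target):
    """Average squared gap between the guessed and the true coordinates."""
    return float(np.mean((predicted - target) ** 2))

initial_loss = mean_squared_error(features @ weights, true_boxes)

learning_rate = 0.1
for step in range(500):
    predicted = features @ weights         # guess 4 numbers per picture
    error = predicted - true_boxes         # compare with the true coordinates
    # Gradient of the squared error, up to a constant factor.
    gradient = features.T @ error / len(features)
    weights -= learning_rate * gradient    # adjust to do better next time

final_loss = mean_squared_error(features @ weights, true_boxes)
print(initial_loss, final_loss)
```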

Your support means a lot to my research and reporting on machine learning and AI. Please follow and upvote (with 100000%!), and I will keep sharing according to your interests. Let me know in the comments!

kenchung (66) 7 years ago

A really great article! Image processing is really an interesting topic. I am sure auto cars will be our future!

guyverckw (70) 7 years ago (edited)

Haven't read through this post yet. Is this the English version of the other Chinese post? For me, it is easier to understand technological things in English than in Chinese. Let me read this one instead. But it is always good to have different languages to support different cultural backgrounds. Thanks.

linuslee0216 (69) 7 years ago (edited)

Hello Jim, your article is great! It is as interesting as your previous ones. You must be very hardworking! Your articles are nice and of high quality, and I like reading them very much. Keep it up. Looking forward to seeing more good posts from you!

soonidrift (49) 7 years ago

Great article, I really enjoyed it.

wilkinshui (69) 7 years ago

Hi Jim, you are one of my favourite writers on Steemit. I hope you keep posting such quality information. By the way, if I want to do some financial backtesting, would you say it is better to do it in R or Python, and what is the best way to get started?

moneybags (49) 7 years ago

Any suggestions on where a complete beginner should start learning Python?