AWS CodeWhisperer: setting up and first tests

in #engrave2 years ago

Everyone has already heard of or use GitHub's Copilot or ChatGPT to build their next billion dollars idea. I am a boring java dev and have spent the last few months (racing DeepRacer, working, helping the community, baking and) in a cave, under a rock. But now the AWS have released the CodeWhisperer and so the time is ripe for me to try something. It's good to know what will make me extinct, apparently.

AWS CodeWhisperer

It's an ML backed tool where you say what you want and it generates code for you. From what I've heard these tools can speed up the work but still require the human part of the equation to grasp the problem to be solved. I mean, it will build a function to upload a file into an S3 bucket but won't solve how it fits into a say event based system parsing files submitted to S3 from a website to trigger notifications and run analytics to detect a certain characteristic of the data. I hope.

I would like to try and play a little bit with it so let me set it up.
I will be using JetBrains IntelliJ to which I will install the AWS Toolkit and will get things rolling.

Install AWS Toolkit

It's as easy as:

  • open the settings
  • go to plugins
  • find AWS Toolkit
  • Install
  • restart

For the less exhibitionist individuals I recommend going into the settings after a restart and disabling the telemetry.

Enabling AWS CodeWhisperer

You may have gone into settings to see that it expects you to be logged in but gives no option to do that. I decided to use the AWS Builder ID which is not attached to an AWS account.

To enable CodeWhisperer you will need to go out of settings if you've got them open, then open the AWS Toolkit pane, select Developer Tools and double click the Start command under CodeWhisperer.

A screenshot showing the lower left corner of IntelliJ where AWS Toolkit is open showing Developer Tools and allowing starting of the CodeWhisperer

You will be (finally) given a chance to authenticate:

A screenshot of the authentication window allowing to log in with a Builder ID

Click connect, it will ask you to copy a code and open a website to authenticate. In there it will ask you for permissions:

A dialog asking for your permission and explaining what the permissions will be used for

Once you return to your IntelliJ, you will be asked to accept Terms of Service. It will outline what data will get collected and remind you that it's a beta and might go away at any time.

Terms of Service approval window for AWS CodeWhisperer

Finally, again for the less exhibitionist individuals, I recommend going into Settings -> Tools -> AWS -> CodeWhisperer and disabling now editable option to share CodeWhisperer content with AWS:

CodeWhisperer settings in IntelliJ

First test

I have created an empty file test.py and started typing:

A comment in a code file saying Function to print out a funny quote

CodeWhisperer immediately suggested inserting code. I selected the option to see:

def funny_quote():

Not quite the solution yet. Then I pressed enter at the end of the function definition and received a code insertion suggestion which I accepted:

print("This is a funny quote")

Well played, CodeWhisperer. I like your sense of humour. I continued onto the new line only to notice that I was getting same suggestion over and over again, and the Enter key kept inserting it. Once I broke out of the suggestion loop, I have received a suggestion that I run my function.

Then I tried to write a function that would open a graphical window. I wrote the first comment below, the rest was suggested by the CodeWhisperer:

# Function to open a graphical window
def open_window():
    print("Opening window")
    # Open a window here
    # ...
    # ...

Maybe I'm expecting too much, let's list files in current directory:

# Function to list files in current directory
def list_files():
    print("Listing files")
    # List files here...
    # ...
    # ...
    # ...

At this point I think I should read the manual.

I have reverted to a video:

It showed me a bunch of hints on how to talk to CodeWhisperer. While I still did not get a running code, I have noticed something nice: CodeWhisperer does learn what I have prepared and suggests what to do next. I tried typing "Write a function" and it suggested a function to list files in a directory and to show that in a graphical window:

# Function to open a graphical window
def open_window():
    print("Opening window")
    # Open a window here...
    # ...


# Function to list files in current directory
def list_files():
    print("Listing files")
    # List files here...
    # ...


# Write a function to list files in current directory
# and open a graphical window
def list_files_and_open_window():
    list_files()
    open_window()

Maybe I'm being a little too vague and that's how I'm getting vague results?

So I've tried something from an example in


I have added two imports and first two lines of the comment, the third was suggested to me.
I have then been given five suggestions of the function. One was an outline of the whole function and four had actual code. I chose this one as it was opening file with the with syntax which I really like as it takes care of the open resources. There were two code snippets that did that and two that did not. The other one did not nest one code to be opened in the other. This is acceptable in Python because the scoping of python variables does not depend on nesting. I'm a Java guy, I find scoping a little more explicit.

import json
import csv

# write a function to convert a json file to a csv file
# json keys are csv columns
# json values are csv rows
def convert_json_to_csv(json_file_name, csv_file_name):
    print("Converting json to csv")
    # Open json file
    with open(json_file_name) as json_file:
        # Read json file
        json_data = json.load(json_file)

        # Open csv file
        with open(csv_file_name, 'w') as csv_file:
            # Create csv writer
            csv_writer = csv.writer(csv_file)

            # Write csv header
            csv_writer.writerow(json_data.keys())

            # Write csv rows
            csv_writer.writerows(zip(*json_data.values()))

    print("Done converting json to csv")

Not bad. One thing that's worth repeating from Vasi Philomin's demo this example comes from: CodeWhisperer is a companion tool. It will not lead the way. Its goal is to save you time digging through docs to find that one way to call that one function that you kind of know and know where it is but can't be bothered to remember all the details for its usage.

Final test for the day

Equipped in this hint I imported csv and created a new comment. Say I am gathering csv files with racing data for AWS DeepRacer. For a given race I want to compare the current and previous files and say when there is something new in them (new racers joining the race) or when there is a change for an existing racer (someone submitted a model). Here's my comment and an outcome selected from the suggestions:

import csv 


# write a function to load two csv files with header rows
# order rows by racerID column
# match rows by racerID
# if racerID is the same, print out the differences
# if racerID is different, print out the row from the second file, load a new row from the second file and compare with the current row from the first file
def compare_csv_files(csv_file1_name, csv_file2_name):
    print("Comparing csv files")
    # Open csv files
    with open(csv_file1_name) as csv_file1:
        with open(csv_file2_name) as csv_file2:
            # Create csv readers
            csv_reader1 = csv.reader(csv_file1)
            csv_reader2 = csv.reader(csv_file2)

            # Read csv header
            csv_header1 = next(csv_reader1)
            csv_header2 = next(csv_reader2)

            # Check if headers are the same
            if csv_header1 != csv_header2:
                print("Headers are different")
                print("csv_header1: ", csv_header1)
                print("csv_header2: ", csv_header2)
                return

            # Create csv dict readers
            csv_dict_reader1 = csv.DictReader(csv_file1, fieldnames=csv_header1)
            csv_dict_reader2 = csv.DictReader(csv_file2, fieldnames=csv_header2)
            
            # Read csv rows
            csv_row1 = next(csv_dict_reader1)
            csv_row2 = next(csv_dict_reader2)
            
            # Compare csv rows
            while csv_row1 and csv_row2:
                # Check if racerID is the same
                if csv_row1['racerID'] != csv_row2['racerID']:
                    print("racerID is different")
                    print("csv_row1: ", csv_row1)
                    print("csv_row2: ", csv_row2)
                    csv_row2 = next(csv_dict_reader2)
                    continue

                # Check if any of the values are different
                for key in csv_row1:
                    if csv_row1[key] != csv_row2[key]:
                        print("Values are different")
                        print("csv_row1: ", csv_row1)
                        print("csv_row2: ", csv_row2)
                        break

                # Load next row from csv_file2
                csv_row2 = next(csv_dict_reader2)

                # Load next row from csv_file1
                csv_row1 = next(csv_dict_reader1)
                

Things got a bit dense. It seems to be an almost valid code. I may have had a suggestion to just use a DictReader in which case headers would be read from the first lines of the csv files but I would not get the comparison there. Note that my request to treat an unmatched row in file 2 as a new entry and to keep the line in file 1 to compare with the next row has been handled.

The only thing that's wrong is an assumption that the rows have been ordered by racerId. But then some other suggestions offered storing the rows in dictionaries which would cover that bit, I just missed the fact that there was no sorting. One of the suggestions had it and then I missed that bit.

CodeWhisperer did not suggest the whole method at first. It was only about 10-15 rows after which it kept suggesting more. At first I found it confusing but then it started making sense: I would not grasp the whole method in one go so it suggested what could fit in the available bit of screen and then I kept reviewing as the code was being suggested.

Conclusion

It's half amazing and half scary. I can imagine the outcome of such a tool lead someone to thinking I wasn't needed anymore as a developer. I don't think they would be right at this point. Such tools can at some point cut the part of work based on recreating the behaviour and are/will be very handy at that. Coding is about more than recreating. But if we join forces, the abilities will grow and results can be pretty amazing.


Originally posted on Code Like A Mother. Hive blog powered by ENGRAVE.