
Migrating to Python 3 on Google App Engine - Part 5 - Finishing touches before deploying to Python 3

In this part of the Migrating to Python 3 on Google App Engine series, I walk you through the last few migration steps that conclude my blog application's journey to the App Engine Python 3 Standard Environment.

Making the code ready for Python 3

The first step of this last migration stage was making sure that my application can run on the Python 3 interpreter.

As mentioned in the introductory post of this series, I was lucky that there wasn't much to do in this area. In fact, I found only one Python version compatibility related change in my Git history.

The iteritems dictionary method was removed in Python 3, hence I had to replace it with the items method in my code, which has a similar effect, although the implementation is slightly different behind the scenes:

# Python 2
sortedColl = sorted(coll.iteritems(), key=lambda x: x[1][0], reverse=True)
# Python 3
sortedColl = sorted(coll.items(), key=lambda x: x[1][0], reverse=True)

Well, as you can see, that wasn't too hard.
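The "slightly different behind the scenes" part is that in Python 2, items() built a full list of pairs, while in Python 3 it returns a lightweight view object that stays in sync with the dictionary. A quick sketch of that view behaviour:

```python
# Sketch: in Python 3, items() returns a live view, not a list copy
coll = {'a': 1, 'b': 2}
view = coll.items()      # a dict view object
coll['c'] = 3            # later mutations are visible through the view
print(len(view))         # 3
print(sorted(view))      # [('a', 1), ('b', 2), ('c', 3)]
```

Since sorted() consumes any iterable, the view works as a drop-in replacement for the old list in the snippet above.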

Updating the runtime configuration

The primary configuration controlling the runtime environment is defined in the app.yaml file, so in order to migrate to the Python 3 Standard Environment, I had to make a few changes there.

The runtime version information had to be updated. I had the following configuration defined for the Python 2 runtime:

# app.yaml
runtime: python27
api_version: 1
threadsafe: true
...

The api_version and threadsafe attributes had to be removed because they are not supported in the new environment. After removing them, I updated the runtime attribute to this:

# app.yaml
runtime: python37
...

The next step was to update the handlers. The static handlers work the same way as in the Python 2 environment, so I left those alone. Since all my routing now happens within the application and I use the standard main.py as its single entry point, the handler configuration became very simple:

...
handlers:
  # my static handlers are here
  ...
  # my script handler
  - url: /.*
    script: auto

It is important to note that if you have static handlers specified, you must either go this route with script: auto set or specify an entrypoint element; the latter requires you to also provide the web server implementation, which must listen on port 8080.
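For comparison, the entrypoint route could look something like this in app.yaml (a sketch; Gunicorn is one common choice of WSGI server here, and would then have to be added to requirements.txt as well):

```yaml
# app.yaml - sketch of the entrypoint alternative
runtime: python37
# App Engine supplies $PORT (8080) at runtime; main:app refers
# to the WSGI application object named app in main.py
entrypoint: gunicorn -b :$PORT main:app
```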

The skip_files element lets you exclude certain parts of your project when uploading the application to App Engine. This element is deprecated in the Python 3 environment and should be removed completely. In the new environment, I use the .gcloudignore file to achieve the same effect, as I explain below.

The libraries element and all its children have to be removed as well:

...
# remove this
libraries:
  ...

Dependencies are pulled in and resolved by using pip and the requirements.txt configuration file instead.

The changes above only work if the related requirements I shared earlier are already in place. To sum them up:

  • The environment should be able to invoke the application by running main.py, which must provide a WSGI application object, by assigning it to the app variable.
  • All the routing should be done by the application (except for static URLs mentioned earlier).
  • You have to provide authentication and authorization within your application if you want to secure certain views.
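Put together, a bare-bones main.py satisfying the first requirement could look like the sketch below. This is just a minimal hand-written WSGI callable for illustration; my actual application builds the app object with Pyramid instead:

```python
# main.py - App Engine imports this module and serves the
# WSGI callable bound to the module-level name `app`.

def app(environ, start_response):
    # All routing happens inside the application; this sketch
    # just answers every request with a plain text body.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello from the Python 3 runtime']
```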

Specifying dependencies

When my application was running in the Python 2 runtime environment, I had two choices if I wanted to use a third party library:

  • If the library was on the list of libraries built into App Engine, I just had to specify the library under the libraries element in app.yaml.
  • If the library wasn't readily available in App Engine, like the PyYAML library in my case, I had to bundle the library with my application code.

As one of the main goals of the Python 3 App Engine runtime environment's design was to rely as much on the standard Python tooling as feasible, App Engine has moved to using pip to pull in any package needed from PyPI automatically.

The most common way of installing packages with pip is from the command line, but it is also possible to list the packages to install in a text file, customarily called requirements.txt; pip reads this file when invoked with it and installs the packages listed as a result.
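For example, pip itself can generate such a file from whatever is installed in the current (preferably virtual) environment, which is one way to bootstrap the list:

```shell
# Snapshot the currently installed packages, with pinned
# versions, into a requirements file
python3 -m pip freeze > requirements.txt
# Later, the same set can be installed elsewhere with:
# python3 -m pip install -r requirements.txt
```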

Creating the requirements file for App Engine

The latter approach is used by App Engine: if you place a requirements.txt file in the root of your project, App Engine reads it when you deploy and automatically installs the listed dependencies before starting the application.

Since the package requirements file is supported by pip out of the box, the format to follow is described in the pip documentation, but here is what I have in my requirements.txt:

google-cloud-ndb ~= 1.2
unidecode ~= 1.1
markdown ~= 3.2
pyramid ~= 1.10
pyramid_jinja2 ~= 2.8

Specifying the version is not mandatory, but it is recommended to avoid surprises caused by package updates. It is a good idea to check the libraries' development status occasionally and upgrade to the latest minor or major version when you are ready.
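The ~= operator used above is pip's "compatible release" specifier: with a major.minor version, it allows newer minor releases but not the next major version, which is what makes the pinning above relatively safe:

```
# requirements.txt - what the specifiers above permit
google-cloud-ndb ~= 1.2   # equivalent to: >= 1.2, < 2.0
markdown ~= 3.2           # equivalent to: >= 3.2, < 4.0
```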

Removing legacy library loading

As the libraries are now installed using pip, there is no need to load bundled third-party libraries imperatively via appengine_config.py, as was done in the Python 2 environment for App Engine. The App Engine runtime no longer runs this module in the Python 3 Standard Environment, so unless it contains utility code imported by other modules of your application, you can remove the file completely.

In my case, the Cloud NDB client is instantiated in this module and the WSGI middleware class for Cloud NDB is defined here, hence I only deleted the following, now unneeded, code from it:

# module: appengine_config
import pkg_resources
from google.appengine.ext import vendor
...
# Add any libraries installed in the "pylibs" folder.
path = 'pylibs'
vendor.add(path)
pkg_resources.working_set.add_entry(path)

# Six import workaround, not needed for Python 3
import six; reload(six)
...

Deleting the bundled libraries no longer used

The package dependencies are now pulled in from PyPI automatically, thus I was able to delete the directory where my bundled 3rd party libraries were stored.

Updating App Engine environment variables

In the Python 2 environment, I was using the SERVER_SOFTWARE environment variable to detect whether the application was running on App Engine or on the local development server, so that connections would go to the local Datastore Emulator while working locally.

The set of available environment variables is different in the Python 3 environment, so I had to change my code to look for GAE_ENV instead. This is what I changed said code snippet to:

# module: appengine_config
...
if not os.getenv('GAE_ENV', '').startswith('standard'):
    # The local datastore emulator details are not
    # read from the shell environment, hence I need to
    # add them here manually.
    os.environ['DATASTORE_DATASET'] = 'redacted'
    os.environ['DATASTORE_EMULATOR_HOST'] = 'localhost:8081'
    os.environ['DATASTORE_EMULATOR_HOST_PATH'] = 'localhost:8081/datastore'
    os.environ['DATASTORE_HOST'] = 'http://localhost:8081'
    os.environ['DATASTORE_PROJECT_ID'] = 'redacted'

You can find the previous state and reasoning behind this configuration in the part of this series explaining the Cloud NDB migration.

The .gcloudignore file

The syntax of the .gcloudignore file borrows heavily from that of .gitignore, and its purpose is very similar: the developer lists path patterns that should be excluded from the upload to App Engine storage.

If this file doesn't exist, the gcloud commands that upload assets to App Engine will automatically create it for the Python 3 environment. The generated content is a minimal, common-sense set of exclude patterns for a Python project, such as the __pycache__ directory and pip's setup.cfg. Git-related metadata, like .git, .gitignore and the .gcloudignore file itself, is excluded by default as well.

However, as I have several Git submodules and NPM modules pulled into the project, mainly for frontend-related assets, I wanted to avoid uploading those to App Engine even once, so I created the .gcloudignore file myself and added all the paths I wanted to ignore, including the ones removed earlier from the skip_files section of app.yaml. This is what my .gcloudignore file looks like at the moment:

# .gcloudignore
# Git related
.git
.gitignore
# Python
__pycache__/
/setup.cfg
# Specific to my project
node_modules
/libs
/node_modules

Testing the changes locally

For Python 3 App Engine, the official documentation recommends simply running a WSGI server locally for testing (you need the package dependencies installed locally, preferably in a virtual environment, before you can do that).
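Setting up such a virtual environment with the standard venv module could look something like this (a sketch; the env directory name is arbitrary):

```shell
# Create an isolated environment for the project and activate it
python3 -m venv env
. env/bin/activate
# The project's dependencies would then be installed into it with:
# pip install -r requirements.txt
```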

For example, you would run the application with the Gunicorn server from the project root directory like this:

gunicorn -b :$PORT main:app

However, since I have configuration in app.yaml related to serving static assets, I had to keep using dev_appserver.py as described in the documentation:

dev_appserver.py --application=PROJECT_ID app.yaml --port=9999

Here the application will listen for requests on port 9999, and PROJECT_ID should of course be replaced with the actual project ID.

Although this setup works, it has some caveats:

  • Ironically, even though the development server will run the application in a Python 3 environment, it requires Python 2.7.12+ to run at the moment.
  • The latest development server does not support development of Python 3 apps on Windows (luckily we have WSL).

Deploying to App Engine

Since things worked fine locally, I was ready to deploy to App Engine. It is a good idea to test on App Engine before directing your live traffic to the new version, so I used the --no-promote command line option when deploying for the first time:

gcloud app deploy --no-promote

This way, I was able to test using a special URL with this structure: https://VERSION_ID-dot-default-dot-PROJECT_ID.REGION_ID.r.appspot.com, where VERSION_ID, PROJECT_ID and REGION_ID have to be filled in according to the project and environment.
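To make the structure concrete, here is how the pieces compose, with hypothetical placeholder values (substitute your own; "default" is the service name):

```shell
# Hypothetical values for illustration only
VERSION_ID=20200101t120000
PROJECT_ID=my-project
REGION_ID=ew
echo "https://${VERSION_ID}-dot-default-dot-${PROJECT_ID}.${REGION_ID}.r.appspot.com"
# prints: https://20200101t120000-dot-default-dot-my-project.ew.r.appspot.com
```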

Since everything worked nicely, I logged into the Cloud Console and, under App Engine / Versions, selected the tested version and clicked Migrate traffic to make the application handle my production traffic.

Migration completed

With these steps, the migration, and with it this post series, is concluded. Thank you for reading, and I hope you found useful information that helped with your App Engine project. In case of any questions or corrections, please do not hesitate to hit me up on Twitter or via email.