nijk

command module
v0.0.0-...-2dd3dea Latest
Published: May 18, 2022 License: MIT Imports: 15 Imported by: 0

Nijk

Nijk helps programmers come up with good names. It analyzes some of the most popular projects to suggest good names for a given context.

Using Nijk is simple. Visit https://nijk-225007.appspot.com/ and select a preset; currently, the only preset is for Python. Enter a term that you are considering for your Python code, and Nijk will suggest potentially better names. You can also explore further from the suggested terms.

Currently, the demo server is stopped because of its maintenance cost. You can watch the demo video instead.

Nijk is named after the common meaningless variable names n, i, j, and k, to remind programmers of the importance of good names. It is also easy to pronounce: it sounds just like Nick [nɪk].

Architecture

Quick Start

Nijk is written with multiple tools and programming languages. To generate a dump, which is the overall output of the Local phase, you will need:

  • Python 3.6 or later
  • Go 1.11 or later

Python

The Python dependencies are managed with Pipenv. If you are familiar with Pipenv, you can simply run pipenv install. Alternatively, you can set up your own virtualenv and run pip install -r requirements.txt.

Go

The Go packages for Nijk are organized with Go Modules, which were introduced in Go 1.11. With Go 1.11 or later, there is nothing you need to do explicitly.

Compile a collection

To compile a collection, you need a preset under the presets directory. For now, the only preset is the one for Python, presets/python.txt. You can also write your own preset as presets/*.txt. Once you are ready, run:

scripts/compile_collection.py python

or simply,

make collections/python.txt

The script will download the source code of the projects specified in the preset, and extract contexts from them into collections/python.txt.

Analyze and Dump as SQL

Once the collection is compiled, you can run the scorer to analyze the collection. Run the following command:

go run ./scorer/cmd 'python' < 'collections/python.txt' > 'dumps/python.sql'

or simply,

make dumps/python.sql

The dump data will look like:

INSERT INTO `python_paradigmatic` VALUES ("encode", "hex_encode", 0.51461);

This row means: "Based on the analysis of the Python preset, the paradigmatic relation score between encode and hex_encode is 0.51461." To import the dump, you need the corresponding table definitions, which are created by the Go package scorer/cmd/schema.
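
To make the shape of a dump row concrete, here is a minimal Python sketch that parses one such statement back into its parts. The names parse_dump_row and Relation, and the regular expression itself, are illustrative assumptions inferred only from the single example row above; they are not part of Nijk, and real dumps may contain rows this pattern does not cover.

```python
import re
from typing import NamedTuple


class Relation(NamedTuple):
    table: str    # e.g. "python_paradigmatic"
    term: str     # first term of the pair
    related: str  # second term of the pair
    score: float  # relation score between the two terms


# Assumed row shape, inferred from the single example row in this README.
_ROW = re.compile(
    r'INSERT INTO `(?P<table>\w+)` VALUES '
    r'\("(?P<term>[^"]*)", "(?P<related>[^"]*)", (?P<score>[0-9.]+)\);'
)


def parse_dump_row(line: str) -> Relation:
    """Parse one INSERT statement of the assumed shape into a Relation."""
    match = _ROW.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unrecognized dump row: {line!r}")
    return Relation(
        match["table"], match["term"], match["related"], float(match["score"])
    )


row = parse_dump_row(
    'INSERT INTO `python_paradigmatic` VALUES ("encode", "hex_encode", 0.51461);'
)
print(row.table, row.term, row.related, row.score)
```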

Cloud SQL and App Engine

Nijk provides the analysis result as a web app, https://nijk-225007.appspot.com. To make this possible, Nijk is deployed to Google Cloud Platform.

Although the source code is open, the actual deployment is private. For this reason, some configurations are hard-coded, including those in the import script scripts/import-dump.sh.

The Cloud SQL service and the permissions needed to import the dump file are defined with Terraform, in infra. The App Engine app is deployed directly with the Google Cloud SDK.

Presets

A preset is a list of projects to be used as ideal naming examples. It is also the basic unit of analysis in Nijk. Based on a preset, Nijk extracts identifiers from the source code of the listed projects to compose a collection of contexts.

For more details, see presets.

Extractors

An extractor is responsible for extracting contexts from a project. Currently, Nijk supports Python only.

For more details, see extractors.

Collection

scripts/compile_collection.py first downloads the projects specified in a preset. It then runs the extractors on those projects. Finally, it concatenates the outputs of those runs.
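
The three steps can be sketched in a few lines of Python. This is only an outline under stated assumptions, not the actual script: compile_collection, download, and extract are hypothetical names, and the stand-in functions in the usage part exist only so the sketch runs without network access.

```python
from typing import Callable, Iterable, List


def compile_collection(
    projects: Iterable[str],
    download: Callable[[str], str],
    extract: Callable[[str], str],
) -> str:
    """Download every project, run the extractor on each, and concatenate.

    `download` maps a project name to a local path and `extract` maps that
    path to the extractor's text output. Both are hypothetical stand-ins.
    """
    # Step 1: download all projects listed in the preset.
    paths: List[str] = [download(project) for project in projects]
    # Step 2: run the extractor on each downloaded project.
    outputs: List[str] = [extract(path) for path in paths]
    # Step 3: concatenate the extractor outputs into one collection.
    return "".join(outputs)


# Usage with fake stand-ins instead of real downloads and extractors.
def fake_download(name: str) -> str:
    return f"/tmp/{name}"


def fake_extract(path: str) -> str:
    return f"contexts from {path}\n"


collection = compile_collection(["a", "b"], fake_download, fake_extract)
print(collection, end="")
```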

For more details, see collections.

Scorer

Scorer is the core of Nijk. It reads a collection and runs paradigmatic and syntagmatic relation discovery algorithms based on normalized BM25. Scorer is implemented in Go. It also has a command-line interface that generates an SQL dump to be imported into a MySQL server.
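
As a rough illustration of what a BM25-based paradigmatic score can look like, the Python sketch below follows one common textbook formulation: each term's context is turned into a BM25-weighted vector normalized to sum to 1, and two terms are compared by an IDF-weighted dot product of their vectors. This is an assumption for illustration only. The actual Go implementation in scorer may differ, and the function names, the parameters k and b, the IDF formula, and the toy data are all made up here.

```python
import math
from collections import Counter
from typing import Dict


def bm25_vector(
    context: Counter, avg_len: float, k: float = 1.2, b: float = 0.75
) -> Dict[str, float]:
    """BM25-transform the word counts of one context, normalized to sum to 1."""
    length = sum(context.values())
    weights = {
        word: (k + 1) * count / (count + k * (1 - b + b * length / avg_len))
        for word, count in context.items()
    }
    total = sum(weights.values())
    return {word: w / total for word, w in weights.items()} if total else {}


def paradigmatic_similarity(
    ctx_a: Counter, ctx_b: Counter, idf: Dict[str, float], avg_len: float
) -> float:
    """IDF-weighted dot product of two normalized BM25 context vectors."""
    x = bm25_vector(ctx_a, avg_len)
    y = bm25_vector(ctx_b, avg_len)
    return sum(idf.get(w, 0.0) * x[w] * y[w] for w in x.keys() & y.keys())


# Toy data (made up): words observed around each term.
contexts = {
    "encode": Counter({"utf8": 2, "bytes": 1}),
    "hex_encode": Counter({"bytes": 2, "hex": 1}),
    "render": Counter({"template": 3}),
}
avg_len = sum(sum(c.values()) for c in contexts.values()) / len(contexts)

# Simple smoothed IDF over the contexts: log((N + 1) / df).
doc_freq = Counter(word for c in contexts.values() for word in c)
idf = {w: math.log((len(contexts) + 1) / df) for w, df in doc_freq.items()}

sim_close = paradigmatic_similarity(
    contexts["encode"], contexts["hex_encode"], idf, avg_len
)
sim_far = paradigmatic_similarity(
    contexts["encode"], contexts["render"], idf, avg_len
)
print(sim_close > sim_far)
```

Terms that share context words, such as encode and hex_encode in the toy data, get a positive score, while terms with disjoint contexts score zero.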

For more details, see package scorer.

Google Cloud Platform

As mentioned above, the Nijk web app is deployed on Google Cloud Platform (GCP). Some configurations are hard-coded, so you cannot deploy it yourself as is. However, because I tried to define all the infrastructure settings as code, you can check almost every detail of the setup.

For the database, Cloud SQL, all the configurations are written in Terraform code, in infra/main.tf. You can also check app.go and app.yaml for the details of the App Engine web application.

To deploy Nijk, I usually just run the following commands:

# To update the DB with newly generated 'dumps/python.sql'
scripts/import-dump.sh 'python'

# To run the dev server of the web app
make dev

# To deploy the web app
make deploy

Caveat

This is my course project for Text Information Systems of MCS-DS.

Directories

  • scorer: Package scorer analyzes a collection to discover paradigmatic and syntagmatic relations.
  • scorer/cmd: Package main runs scorer on os.Stdin and prints SQL statements to os.Stdout to generate a dump file of the analysis result.
  • scorer/cmd/schema: Package main creates the tables for importing data generated by scorer/cmd.
