Presto is a distributed SQL query engine for big data.
See the User Manual for deployment instructions and end user documentation.
We changed the Presto log format to JSON so that it works with Elasticsearch.
Install the log-manager module. If the log-manager version no longer matches (check <dep.airlift.version><version></dep.airlift.version> in the properties section of the top-level pom.xml file), it needs to be rebuilt from the Airlift project.
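The version check described above can be scripted. The following is a minimal sketch, assuming the property sits on a single line of pom.xml; the function name airlift_version is made up for illustration and is not part of Presto or Airlift.

```shell
# Hypothetical helper: print the value of <dep.airlift.version> from a pom.xml.
# Assumes the opening and closing tags are on the same line.
# Usage (from the presto project root): airlift_version pom.xml
airlift_version() {
    sed -n 's:.*<dep\.airlift\.version>\(.*\)</dep\.airlift\.version>.*:\1:p' "${1:-pom.xml}" | head -n 1
}
```

Compare the printed value with the version of the log-manager jar you have installed to decide whether a rebuild is needed.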
If you are rebuilding, git clone Airlift (depth=1, branch=) from https://github.com/airlift/airlift and modify log-manager/src/main/java/io/airlift/log/StaticFormatter.java:
After this line:
StringWriter stringWriter = new StringWriter()
Remove all of the following lines beginning with .append except this one:
.append(record.getMessage());
Run the following command in the root folder of the Airlift project to build just the log-manager project:
mvn clean install -pl log-manager -DskipTests
Copy the log-manager-.jar file from the log-manager/target folder to the dependencies folder in the presto project.
In the presto project root folder, run:
mvn install:install-file -Dfile=dependencies/log-manager-0.178.jar -DgroupId=io.airlift -DartifactId=log-manager -Dversion=0.178 -Dpackaging=jar
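The version number appears twice in the command above, so it is easy to update one occurrence and miss the other. The following sketch, which uses a hypothetical helper name, prints the command for a given version rather than running it, so it can be reviewed first.

```shell
# Hypothetical helper: print the install-file command for a log-manager version.
# Nothing is executed here; run the printed command from the presto project root.
log_manager_install_cmd() {
    printf 'mvn install:install-file -Dfile=dependencies/log-manager-%s.jar -DgroupId=io.airlift -DartifactId=log-manager -Dversion=%s -Dpackaging=jar\n' "$1" "$1"
}

log_manager_install_cmd 0.178
```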
To merge upstream changes into this fork, see https://help.github.com/en/articles/merging-an-upstream-repository-into-your-fork
Manually resolve the conflicts that arise when merging the checkr changes to FileBasedAccessControl.java with the upstream presto code:
- Accept incoming changes
- Remove duplicate code
- Change class name of Identity parameters to ConnectorIdentity
- Remove unused imports
./mvnw clean install -DskipTests
After compiling presto-server, copy the following files to a local clone of the preston repo for testing:
From the presto-server/target directory of presto, copy presto-server-<version>-SNAPSHOT.tar.gz as presto-server.tar.gz
From the presto-cli/target directory of presto, copy presto-cli-<version>-SNAPSHOT-executable.jar as cli.jar
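The two copy steps can be wrapped in a small function. This is a sketch only: the function name is hypothetical, and it assumes the server tarball is written to presto-server/target (the presto-server module's own output directory) and that the destination directory already exists.

```shell
# Hypothetical helper: copy both build artifacts under the fixed names used for testing.
# Usage: stage_presto_artifacts <presto checkout> <version> <destination dir>
# Assumption: the server tarball lives in presto-server/target.
stage_presto_artifacts() {
    cp "$1/presto-server/target/presto-server-$2-SNAPSHOT.tar.gz" "$3/presto-server.tar.gz" &&
        cp "$1/presto-cli/target/presto-cli-$2-SNAPSHOT-executable.jar" "$3/cli.jar"
}
```

The second copy only runs if the first succeeds, so a missing tarball is reported before anything else is staged.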
See the "To test a new presto build" notes in the Preston Dockerfile.
After testing, copy the files to S3: us-east-1-checkr-data-warehouse/binaries
We changed the file-based permissions to allow creating and dropping schemas. The modified file is:
presto/presto-plugin-toolkit/src/main/java/com/facebook/presto/plugin/base/security/FileBasedAccessControl.java
- If not rebuilding presto-server, build the presto-plugin-toolkit module:
mvn clean install -pl presto-plugin-toolkit -DskipTests
- If not rebuilding presto-server, build the modules that depend on presto-plugin-toolkit:
mvn clean install -pl presto-atop -DskipTests
mvn clean install -pl presto-hive -DskipTests
mvn clean install -pl presto-raptor -DskipTests
mvn clean install -pl presto-hive-hadoop2 -DskipTests
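Maven's -pl option accepts a comma-separated list of modules, so the separate builds above can likely be combined into one invocation. The following sketch uses a hypothetical helper, pl_list, to join the module names and prints the resulting command instead of running it.

```shell
# Hypothetical helper: join module names into the comma-separated form -pl expects.
pl_list() {
    (IFS=,; printf '%s\n' "$*")
}

# Print the combined build command for review; it is not executed here.
printf 'mvn clean install -pl %s -DskipTests\n' \
    "$(pl_list presto-plugin-toolkit presto-atop presto-hive presto-raptor presto-hive-hadoop2)"
```

Maven orders the selected modules by their dependencies, so presto-plugin-toolkit should still be built before the modules that depend on it.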
- Mac OS X or Linux
- Java 8 Update 151 or higher (8u151+), 64-bit. Both Oracle JDK and OpenJDK are supported.
- Maven 3.3.9+ (for building)
- Python 2.4+ (for running with the launcher script)
Presto is a standard Maven project. Simply run the following command from the project root directory:
./mvnw clean install
On the first build, Maven will download all the dependencies from the internet and cache them in the local repository (~/.m2/repository), which can take a considerable amount of time. Subsequent builds will be faster.
Presto has a comprehensive set of unit tests that can take several minutes to run. You can disable the tests when building:
./mvnw clean install -DskipTests
After building Presto for the first time, you can load the project into your IDE and run the server. We recommend using IntelliJ IDEA. Because Presto is a standard Maven project, you can import it into your IDE using the root pom.xml file. In IntelliJ, choose Open Project from the Quick Start box or choose Open from the File menu and select the root pom.xml file.
After opening the project in IntelliJ, double-check that the Java SDK is properly configured for the project:
- Open the File menu and select Project Structure
- In the SDKs section, ensure that a 1.8 JDK is selected (create one if none exist)
- In the Project section, ensure the Project language level is set to 8.0 as Presto makes use of several Java 8 language features
Presto comes with sample configuration that should work out-of-the-box for development. Use the following options to create a run configuration:
- Main Class: com.facebook.presto.server.PrestoServer
- VM Options: -ea -XX:+UseG1GC -XX:G1HeapRegionSize=32M -XX:+UseGCOverheadLimit -XX:+ExplicitGCInvokesConcurrent -Xmx2G -Dconfig=etc/config.properties -Dlog.levels-file=etc/log.properties
- Working directory: $MODULE_DIR$
- Use classpath of module: presto-main
The working directory should be the presto-main subdirectory. In IntelliJ, using $MODULE_DIR$ accomplishes this automatically.
Additionally, the Hive plugin must be configured with the location of your Hive metastore Thrift service. Add the following to the list of VM options, replacing localhost:9083 with the correct host and port (or use the value below if you do not have a Hive metastore):
-Dhive.metastore.uri=thrift://localhost:9083
If your Hive metastore or HDFS cluster is not directly accessible to your local machine, you can use SSH port forwarding to access it. Set up a dynamic SOCKS proxy with SSH listening on local port 1080:
ssh -v -N -D 1080 server
Then add the following to the list of VM options:
-Dhive.metastore.thrift.client.socks-proxy=localhost:1080
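To keep the two Hive-related VM options consistent, they can be generated in one place. This is a minimal sketch with a hypothetical function name; it only prints the options, which you would then paste into the run configuration.

```shell
# Hypothetical helper: print the Hive VM options described above.
# Usage: hive_vm_options <metastore host:port> [socks proxy host:port]
hive_vm_options() {
    opts="-Dhive.metastore.uri=thrift://$1"
    # Add the SOCKS proxy option only when a proxy address is given.
    if [ -n "${2:-}" ]; then
        opts="$opts -Dhive.metastore.thrift.client.socks-proxy=$2"
    fi
    printf '%s\n' "$opts"
}

hive_vm_options localhost:9083 localhost:1080
```

Omitting the second argument yields only the metastore URI option, which matches the case where the metastore is directly reachable.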
Start the CLI to connect to the server and run SQL queries:
presto-cli/target/presto-cli-*-executable.jar
Run a query to see the nodes in the cluster:
SELECT * FROM system.runtime.nodes;
In the sample configuration, the Hive connector is mounted in the hive catalog, so you can run the following query to show the tables in the Hive database default:
SHOW TABLES FROM hive.default;
We recommend you use IntelliJ as your IDE. The code style template for the project can be found in the codestyle repository along with our general programming and Java guidelines. In addition to those you should also adhere to the following:
- Alphabetize sections in the documentation source files (both in table of contents files and other regular documentation files). In general, alphabetize methods/variables/sections if such ordering already exists in the surrounding code.
- When appropriate, use the Java 8 stream API. However, note that the stream implementation does not perform well so avoid using it in inner loops or otherwise performance sensitive sections.
- Categorize errors when throwing exceptions. For example, PrestoException takes an error code as an argument, PrestoException(HIVE_TOO_MANY_OPEN_PARTITIONS). This categorization lets you generate reports so you can monitor the frequency of various failures.
- Ensure that all files have the appropriate license header; you can generate the license by running mvn license:format.
- Consider using String formatting (printf style formatting using the Java Formatter class): format("Session property %s is invalid: %s", name, value) (note that format() should always be statically imported). Sometimes, if you only need to append something, consider using the + operator.
- Avoid using the ternary operator except for trivial expressions.
- Use an assertion from Airlift's Assertions class if there is one that covers your case rather than writing the assertion by hand. Over time we may move over to more fluent assertions like AssertJ.
- When writing a Git commit message, follow these guidelines.
The Presto Web UI is composed of several React components and is written in JSX and ES6. This source code is compiled and packaged into browser-compatible JavaScript, which is then checked in to the Presto source code (in the dist folder). You must have Node.js and Yarn installed to execute these commands. To update this folder after making changes, simply run:
yarn --cwd presto-main/src/main/resources/webapp/src install
If no JavaScript dependencies have changed (i.e., no changes to package.json), it is faster to run:
yarn --cwd presto-main/src/main/resources/webapp/src run package
To simplify iteration, you can also run in watch mode, which automatically re-compiles when changes to source files are detected:
yarn --cwd presto-main/src/main/resources/webapp/src run watch
To iterate quickly, simply re-build the project in IntelliJ after packaging is complete. Project resources will be hot-reloaded and changes are reflected on browser refresh.
When authoring a pull request, the PR description should include its relevant release notes. Follow Release Notes Guidelines when authoring release notes.