Flex transformation updates #550

Merged · 31 commits · Oct 25, 2023

Commits
3cff05e
refactor(Update to various transform classes to accommodate flex):
Jul 20, 2023
47d1636
refactor(EditorControllerTest.java): Changed feed source to one that …
Jul 20, 2023
946461f
refactor(Unit test and gtfs-lib update): Update to use refactored sto…
Jul 24, 2023
368b98a
refactor(pom.xml): Reverted gtfs-lib back to latest dev-flex commit
Jul 24, 2023
70d9554
refactor(pom.xml): Changed gtfs-lib to reference latest stop times no…
Jul 25, 2023
e0f19d3
refactor(Test data update): Added missing commas to test data which w…
Jul 26, 2023
c3a3a71
refactor(Updated unit test): Created distinct feed for normalizing fi…
Aug 10, 2023
379809c
refactor(Corrected feed config): Deleted stop areas and re-instated s…
Aug 10, 2023
dd91fb9
refactor(NormalizeFieldTransformJobTest.java): Refactored to create m…
Aug 10, 2023
efc0a40
refactor(Bumped gtfs lib to latest dev-flex version.): Updated test d…
Aug 11, 2023
6ffa121
refactor(Added additional trips file): unit test failing validation b…
Aug 11, 2023
7a1f101
refactor(Reduce normalize fields unit test to just routes):
Aug 11, 2023
3199283
refactor(Re-instated areas and stop areas normalize field transformat…
Aug 11, 2023
bfa99eb
refactor(NormalizeFieldTransformJobTest): Updated fields to be normal…
Aug 11, 2023
a42d8b6
refactor(Created separate feed source for normalizing field unit tests):
Aug 11, 2023
76a63e1
refactor(Bug fixed and parameterized normalizing of field tests):
Aug 14, 2023
ada194b
refactor(NormalizeFieldTransformJobTest): Changed back to using a sin…
Aug 15, 2023
fdac958
refactor(NormalizeFieldTransformJobTest): Changed assertion.
Aug 15, 2023
d0f5421
refactor(NormalizeFieldTransformJobTest): Added table name to feed so…
Aug 15, 2023
f768a78
refactor(NormalizeFieldTransformJobTest): Changed test data to take c…
Aug 15, 2023
4e8fa0c
refactor(pom.xml): Bumped gtfs-lib version to stop time normalization…
Aug 16, 2023
8b8ddbb
refactor(pom.xml): Bumped gtfs lib to include stop times normalizatio…
Aug 16, 2023
f503fae
refactor(NormalizeFieldTransformJobTest): Restored correct feed sourc…
Aug 16, 2023
f1c65c2
Fixed merge conflict
Aug 18, 2023
34545cb
refactor(Bumped GTFS-lib version): Version now matches the latest dev…
Aug 21, 2023
e895aca
Update src/main/java/com/conveyal/datatools/manager/jobs/ProcessSingl…
br648 Sep 1, 2023
ae95fb0
refactor(ProcessSingleFeedJob):
Sep 1, 2023
738abc2
refactor(Reverted grammer changes to unit tests): Notification messag…
br648 Sep 6, 2023
f0890c7
refactor(Addressed PR feedback):
br648 Oct 25, 2023
1d70389
refactor(Fixed merge conflicts):
br648 Oct 25, 2023
a74c545
refactor(pom.xml): Corrected gtfs-lib version
br648 Oct 25, 2023
Files changed

pom.xml (2 changes: 1 addition & 1 deletion)

@@ -275,7 +275,7 @@
         <dependency>
             <groupId>com.github.conveyal</groupId>
             <artifactId>gtfs-lib</artifactId>
-            <version>5c383ab</version>
+            <version>32c084009930501ce992695be73a83ae7f7f767e</version>
             <!-- Exclusions added in order to silence SLF4J warnings about multiple bindings:
                  http://www.slf4j.org/codes.html#multiple_bindings
             -->
src/main/java/com/conveyal/datatools/manager/jobs/ProcessSingleFeedJob.java

@@ -13,6 +13,7 @@
 import com.conveyal.datatools.manager.models.transform.FeedTransformRules;
 import com.conveyal.datatools.manager.models.transform.FeedTransformZipTarget;
 import com.conveyal.datatools.manager.models.transform.ZipTransformation;
+import com.conveyal.gtfs.validator.ValidationResult;
 import com.fasterxml.jackson.annotation.JsonIgnore;
 import com.fasterxml.jackson.annotation.JsonProperty;
 import org.slf4j.Logger;
@@ -198,15 +199,22 @@ private String getErrorReasonMessage() {
     public String getNotificationMessage() {
         StringBuilder message = new StringBuilder();
         if (!status.error) {
-            message.append(String.format("New feed version created for %s (valid from %s - %s). ",
-                feedSource.name,
-                feedVersion.validationResult.firstCalendarDate,
-                feedVersion.validationResult.lastCalendarDate));
-            if (feedVersion.validationResult.errorCount > 0) {
-                message.append(String.format("During validation, we found %s issue(s)",
-                    feedVersion.validationResult.errorCount));
+            ValidationResult validationResult = feedVersion.validationResult;
+            if (validationResult != null) {
+                message.append(String.format("New feed version created for %s (valid from %s - %s). ",
+                    feedSource.name,
+                    feedVersion.validationResult.firstCalendarDate,
+                    feedVersion.validationResult.lastCalendarDate
+                ));
             } else {
-                message.append("The validation check found no issues with this new dataset!");
+                message.append(String.format("New feed version created for %s.", feedSource.name));
+            }
+            if (validationResult != null) {
+                if (validationResult.errorCount > 0) {
+                    message.append(String.format("During validation, we found %s issue(s)", validationResult.errorCount));
+                } else {
+                    message.append("The validation check found no issues with this new dataset!");
+                }
             }
         } else {
             // Processing did not complete. Depending on which sub-task this occurred in,
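The null guard on validationResult is the behavioral core of this hunk: a feed version whose validation never produced a result now yields a shorter notification instead of a NullPointerException. A minimal JUnit sketch of that expectation follows; the fixture wiring and field access are hypothetical, not part of this PR.

```java
// Hedged sketch: verifies the no-validation-result path introduced above.
// Assumes a prepared `job` (ProcessSingleFeedJob) with accessible feedVersion
// and feedSource fixtures; names here are illustrative only.
@Test
void notificationMessageToleratesMissingValidationResult() {
    job.feedVersion.validationResult = null;
    assertEquals(
        String.format("New feed version created for %s.", job.feedSource.name),
        job.getNotificationMessage()
    );
}
```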
src/main/java/com/conveyal/datatools/manager/models/FeedVersion.java

@@ -554,7 +554,9 @@ public void delete() {

         feedStore.deleteFeed(id);
         // Delete feed version tables in GTFS database
-        GTFS.delete(this.namespace, DataManager.GTFS_DATA_SOURCE);
+        if (this.namespace != null) {
+            GTFS.delete(this.namespace, DataManager.GTFS_DATA_SOURCE);
+        }
         LOG.info("Dropped version's GTFS tables from Postgres.");
         // Remove this FeedVersion from all Deployments associated with this FeedVersion's FeedSource's Project
         // TODO TEST THOROUGHLY THAT THIS UPDATE EXPRESSION IS CORRECT
src/main/java/com/conveyal/datatools/manager/models/transform/ZipTransformation.java

@@ -2,6 +2,7 @@

 import com.conveyal.datatools.common.status.MonitorableJob;
 import com.conveyal.datatools.manager.utils.GtfsUtils;
+import com.conveyal.gtfs.loader.Table;
 import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
 import com.fasterxml.jackson.annotation.JsonSubTypes;
 import com.fasterxml.jackson.annotation.JsonTypeInfo;
@@ -100,8 +101,16 @@ public void doTransform(FeedTransformTarget target, MonitorableJob.Status status
      */
     protected void validateTableName(MonitorableJob.Status status) {
         // Validate fields before running transform.
-        if (GtfsUtils.getGtfsTable(table) == null) {
-            status.fail("Table must be valid GTFS spec table name (without .txt).");
+        if (GtfsUtils.getGtfsTable(table) == null && !Table.LOCATION_GEO_JSON_FILE_NAME.equals(table)) {
+            status.fail(String.format("Table must be valid GTFS spec table name (without .txt) or %s.", Table.LOCATION_GEO_JSON_FILE_NAME));
         }
     }
+
+    protected String getTableName() {
+        return Table.LOCATION_GEO_JSON_FILE_NAME.equals(table) ? table : table + ".txt";
+    }
+
+    protected String getTableSuffix() {
+        return Table.LOCATION_GEO_JSON_FILE_NAME.equals(table) ? ".geojson" : ".txt";
+    }
 }
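The two new helpers concentrate the locations.geojson special case in one place so the concrete transformations further down can stay extension-agnostic. Below is a standalone sketch of their contract, assuming Table.LOCATION_GEO_JSON_FILE_NAME resolves to the full file name "locations.geojson" (an assumption; the constant's value in gtfs-lib is not shown in this PR).

```java
// Standalone illustration of the getTableName()/getTableSuffix() logic added
// to ZipTransformation. LOCATION_GEO_JSON_FILE_NAME is assumed here to be
// "locations.geojson"; in the real code it comes from com.conveyal.gtfs.loader.Table.
public class TableNameSketch {
    private static final String LOCATION_GEO_JSON_FILE_NAME = "locations.geojson";

    static String getTableName(String table) {
        // Flex locations already carry their extension; GTFS spec tables get ".txt".
        return LOCATION_GEO_JSON_FILE_NAME.equals(table) ? table : table + ".txt";
    }

    static String getTableSuffix(String table) {
        // Keeps temp-file extensions consistent with the table they hold.
        return LOCATION_GEO_JSON_FILE_NAME.equals(table) ? ".geojson" : ".txt";
    }

    public static void main(String[] args) {
        System.out.println(getTableName("routes"));              // routes.txt
        System.out.println(getTableName("locations.geojson"));   // locations.geojson
        System.out.println(getTableSuffix("locations.geojson")); // .geojson
    }
}
```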
src/main/java/com/conveyal/datatools/manager/models/transform/NormalizeFieldTransformation.java

@@ -179,6 +179,11 @@ public static List<String> getInvalidSubstitutionPatterns(List<Substitution> sub

     @Override
     public void transform(FeedTransformZipTarget zipTarget, MonitorableJob.Status status) {
+        if (Table.LOCATION_GEO_JSON_FILE_NAME.equals(table)) {
+            // It's not possible to select the locations.geojson file from the normalize field transformation list. This
+            // is here in case that changes.
+            throw new UnsupportedOperationException("It is not possible to normalize geo json fields.");
+        }
         String tableName = table + ".txt";
         try(
             // Hold output before writing to ZIP
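Since the UI cannot currently select locations.geojson for this transformation, the guard above is purely defensive. If that path ever becomes reachable, a test along these lines would pin the behavior (a sketch only; `transformation`, `zipTarget`, and `status` are assumed fixtures, not names from this PR):

```java
// Hedged sketch: the normalize-field transformation should refuse GeoJSON
// input outright rather than attempt CSV-style row processing on it.
assertThrows(
    UnsupportedOperationException.class,
    () -> transformation.transform(zipTarget, status)
);
```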
src/main/java/com/conveyal/datatools/manager/models/transform/PreserveCustomFieldsTransformation.java

@@ -60,14 +60,14 @@ private static HashMap<String, Map<String, String>> createCsvHashMap(CsvMapReade

     @Override
     public void transform(FeedTransformZipTarget zipTarget, MonitorableJob.Status status) throws Exception{
-        String tableName = table + ".txt";
+        String tableName = getTableName();
         Path targetZipPath = Paths.get(zipTarget.gtfsFile.getAbsolutePath());
         Optional<Table> streamResult = Arrays.stream(Table.tablesInOrder)
             .filter(t -> t.name.equals(table))
             .findFirst();

-        if (!streamResult.isPresent()) {
-            throw new Exception(String.format("could not find specTable for table %s", table));
+        if (streamResult.isEmpty()) {
+            throw new IOException(String.format("could not find specTable for table %s", table));
         }
         Table specTable = streamResult.get();

@@ -77,8 +77,8 @@ public void transform(FeedTransformZipTarget zipTarget, MonitorableJob.Status st

         Path targetTxtFilePath = getTablePathInZip(tableName, targetZipFs);

-        final File tempFile = File.createTempFile(tableName + "-temp", ".txt");
-        File output = File.createTempFile(tableName + "-output-temp", ".txt");
+        final File tempFile = File.createTempFile(tableName + "-temp", getTableSuffix());
+        File output = File.createTempFile(tableName + "-output-temp", getTableSuffix());
         int rowsModified = 0;
         List<String> customFields;
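Two small modernizations ride along in this hunk: the broad checked `Exception` narrows to `IOException`, and `!isPresent()` becomes `isEmpty()`. Note that `Optional.isEmpty()` only exists on Java 11 and later; the equivalence is easy to demonstrate:

```java
import java.util.Optional;

// Optional.isEmpty() (added in Java 11) is the direct negation of isPresent();
// both prints below emit "true" for an empty Optional.
public class OptionalIsEmptySketch {
    public static void main(String[] args) {
        Optional<String> streamResult = Optional.empty();
        System.out.println(!streamResult.isPresent()); // Java 8 idiom
        System.out.println(streamResult.isEmpty());    // Java 11+ idiom
    }
}
```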
src/main/java/com/conveyal/datatools/manager/models/transform/ReplaceFileFromVersionTransformation.java

@@ -42,7 +42,7 @@ public void validateParameters(MonitorableJob.Status status) {

     @Override
     public void transform(FeedTransformZipTarget zipTarget, MonitorableJob.Status status) {
         FeedVersion sourceVersion = getSourceVersion();
-        String tableName = table + ".txt";
+        String tableName = getTableName();
         // Run the replace transformation
         Path sourceZipPath = Paths.get(sourceVersion.retrieveGtfsFile().getAbsolutePath());
         try (FileSystem sourceZipFs = FileSystems.newFileSystem(sourceZipPath, (ClassLoader) null)) {
src/main/java/com/conveyal/datatools/manager/models/transform/ReplaceFileFromStringTransformation.java

@@ -32,7 +32,7 @@ public void validateParameters(MonitorableJob.Status status) {

     @Override
     public void transform(FeedTransformZipTarget zipTarget, MonitorableJob.Status status) {
-        String tableName = table + ".txt";
+        String tableName = getTableName();
         Path targetZipPath = Paths.get(zipTarget.gtfsFile.getAbsolutePath());
         try (
             FileSystem targetZipFs = FileSystems.newFileSystem(targetZipPath, (ClassLoader) null);
src/test/java/com/conveyal/datatools/manager/jobs/NormalizeFieldTransformJobTest.java

@@ -1,6 +1,7 @@
 package com.conveyal.datatools.manager.jobs;

 import com.conveyal.datatools.DatatoolsTest;
+import com.conveyal.datatools.manager.auth.Auth0Connection;
 import com.conveyal.datatools.manager.models.FeedRetrievalMethod;
 import com.conveyal.datatools.manager.models.FeedSource;
 import com.conveyal.datatools.manager.models.FeedVersion;
@@ -18,12 +19,18 @@
 import org.junit.jupiter.api.AfterEach;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
+import org.junit.jupiter.params.ParameterizedTest;
+import org.junit.jupiter.params.provider.Arguments;
+import org.junit.jupiter.params.provider.MethodSource;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;

 import java.io.BufferedReader;
 import java.io.IOException;
 import java.io.InputStream;
 import java.io.InputStreamReader;
 import java.util.Date;
+import java.util.stream.Stream;
 import java.util.zip.ZipEntry;
 import java.util.zip.ZipFile;
@@ -34,20 +41,21 @@


 public class NormalizeFieldTransformJobTest extends DatatoolsTest {
+    private static final Logger LOG = LoggerFactory.getLogger(NormalizeFieldTransformJobTest.class);
     private static final String TABLE_NAME = "routes";
     private static final String FIELD_NAME = "route_long_name";
     private static Project project;
     private static FeedSource feedSource;
     private FeedVersion targetVersion;
-

     /**
      * Initialize Data Tools and set up a simple feed source and project.
      */
     @BeforeAll
     public static void setUp() throws IOException {
         // start server if it isn't already running
         DatatoolsTest.setUp();
+        Auth0Connection.setAuthDisabled(true);

         // Create a project.
         project = createProject();
@@ -58,6 +66,7 @@ public static void setUp() throws IOException {
      */
     @AfterAll
     public static void tearDown() {
+        Auth0Connection.setAuthDisabled(Auth0Connection.getDefaultAuthDisabled());
         // Project delete cascades to feed sources.
         project.delete();
     }
@@ -75,50 +84,59 @@ public void tearDownTest() {
      * Test that a {@link NormalizeFieldTransformation} will successfully complete.
      * FIXME: On certain Windows machines, this test fails.
      */
-    @Test
-    public void canNormalizeField() throws IOException {
-        // Create transform.
-        // In this test, as an illustration, replace "Route" with the "Rte" abbreviation in routes.txt.
-        FeedTransformation<FeedTransformZipTarget> transformation = NormalizeFieldTransformationTest.createTransformation(
-            TABLE_NAME, FIELD_NAME, null, Lists.newArrayList(
-                new Substitution("Route", "Rte")
-            )
-        );
-        initializeFeedSource(transformation);
+    @ParameterizedTest
+    @MethodSource("createNormalizedFieldCases")
+    void canNormalizeField(TransformationCase transformationCase) throws IOException {
+        initializeFeedSource(transformationCase.table, createTransformation(transformationCase));

         // Create target version that the transform will operate on.
         targetVersion = createFeedVersion(
             feedSource,
-            zipFolderFiles("fake-agency-with-only-calendar")
+            zipFolderFiles("fake-agency-for-field-normalizing")
         );

         try (ZipFile zip = new ZipFile(targetVersion.retrieveGtfsFile())) {
-            // Check that new version has routes table modified.
-            ZipEntry entry = zip.getEntry(TABLE_NAME + ".txt");
-            assertNotNull(entry);
-
-            // Scan the first data row in routes.txt and check that the substitution
-            // that was defined in setUp was done.
-            try (
-                InputStream stream = zip.getInputStream(entry);
-                InputStreamReader streamReader = new InputStreamReader(stream);
-                BufferedReader reader = new BufferedReader(streamReader)
-            ) {
-                String[] columns = reader.readLine().split(",");
-                int fieldIndex = ArrayUtils.indexOf(columns, FIELD_NAME);
-
-                String row1 = reader.readLine();
-                String[] row1Fields = row1.split(",");
-                assertTrue(row1Fields[fieldIndex].startsWith("Rte "), row1);
-            }
+            // Check that new version has expected modifications.
+            checkTableForModification(zip, transformationCase);
         }
     }

+    private static Stream<Arguments> createNormalizedFieldCases() {
+        return Stream.of(
+            Arguments.of(new TransformationCase("routes", "route_long_name", "Route", "Rte")),
+            Arguments.of(new TransformationCase("booking_rules", "pickup_message", "Message", "Msg")),
+            Arguments.of(new TransformationCase("areas", "area_name", "Area", "Place"))
+        );
+    }
+
+    private void checkTableForModification(ZipFile zip, TransformationCase transformationCase) throws IOException {
+        String tableName = transformationCase.table + ".txt";
+        LOG.info("Getting table {} from zip {}", tableName, zip.getName());
+        // Check that the new version has been modified.
+        ZipEntry entry = zip.getEntry(tableName);
+        assertNotNull(entry);
+
+        // Scan the first data row and check that the substitution that was defined in the set-up was done.
+        try (
+            InputStream stream = zip.getInputStream(entry);
+            InputStreamReader streamReader = new InputStreamReader(stream);
+            BufferedReader reader = new BufferedReader(streamReader)
+        ) {
+            String[] columns = reader.readLine().split(",");
+            int fieldIndex = ArrayUtils.indexOf(columns, transformationCase.fieldName);
+
+            String rowOne = reader.readLine();
+            assertNotNull(rowOne, String.format("First row in table %s is null!", transformationCase.table));
+            String[] row1Fields = rowOne.split(",");
+            assertTrue(row1Fields[fieldIndex].contains(transformationCase.replacement), rowOne);
+        }
+    }
+
     /**
      * Test that a {@link NormalizeFieldTransformation} will fail if invalid substitution patterns are provided.
      */
     @Test
-    public void canHandleInvalidSubstitutionPatterns() throws IOException {
+    void canHandleInvalidSubstitutionPatterns() throws IOException {
         // Create transform.
         // In this test, we provide an invalid pattern '\Cir\b' (instead of '\bCir\b'),
         // when trying to replace e.g. 'Piedmont Cir' with 'Piedmont Circle'.
@@ -127,7 +145,7 @@ public void canHandleInvalidSubstitutionPatterns() throws IOException {
             new Substitution("\\Cir\\b", "Circle")
         )
     );
-        initializeFeedSource(transformation);
+        initializeFeedSource(TABLE_NAME, transformation);

         // Create target version that the transform will operate on.
         targetVersion = createFeedVersion(
@@ -140,25 +158,50 @@
         assertTrue(targetVersion.hasCriticalErrors());
     }

+    private FeedTransformation<FeedTransformZipTarget> createTransformation(TransformationCase transformationCase) {
+        return NormalizeFieldTransformationTest
+            .createTransformation(
+                transformationCase.table,
+                transformationCase.fieldName,
+                null,
+                Lists.newArrayList(
+                    new Substitution(transformationCase.pattern, transformationCase.replacement)
+                )
+            );
+    }
+
     /**
      * Create and persist a feed source using the given transformation.
      */
-    private static void initializeFeedSource(FeedTransformation<FeedTransformZipTarget> transformation) {
-        FeedTransformRules transformRules = new FeedTransformRules(transformation);
+    private void initializeFeedSource(String table, FeedTransformation<FeedTransformZipTarget> transformation) {
+
         // Create feed source with above transform.
-        feedSource = new FeedSource("Normalize Field Test Feed", project.id, FeedRetrievalMethod.MANUALLY_UPLOADED);
+        feedSource = new FeedSource(table + " Normalize Field Test Feed", project.id, FeedRetrievalMethod.MANUALLY_UPLOADED);
         feedSource.deployable = false;
-        feedSource.transformRules.add(transformRules);
+        feedSource.transformRules.add(new FeedTransformRules(transformation));
         Persistence.feedSources.create(feedSource);
     }

     // FIXME: Refactor (almost same code as AutoDeployJobTest in PR #361,
     // and some common code with PersistenceTest#createProject).
     private static Project createProject() {
         Project project = new Project();
-        project.name = String.format("Test Project %s", new Date().toString());
+        project.name = String.format("Test Project %s", new Date());
         Persistence.projects.create(project);
         return project;
     }
+
+    private static class TransformationCase {
+        public String table;
+        public String fieldName;
+        public String pattern;
+        public String replacement;
+
+        public TransformationCase(String table, String fieldName, String pattern, String replacement) {
+            this.table = table;
+            this.fieldName = fieldName;
+            this.pattern = pattern;
+            this.replacement = replacement;
+        }
+    }
 }
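The TransformationCase holder plus the @MethodSource provider make the flex coverage easy to grow: one line per new table. A hypothetical fourth case might look like the sketch below; it is not part of this PR, and it leans on the stops.txt fixture added further down (stop_name "Butler Ln"):

```java
// Hypothetical extension of createNormalizedFieldCases(); illustrative only.
private static Stream<Arguments> createNormalizedFieldCases() {
    return Stream.of(
        Arguments.of(new TransformationCase("routes", "route_long_name", "Route", "Rte")),
        Arguments.of(new TransformationCase("booking_rules", "pickup_message", "Message", "Msg")),
        Arguments.of(new TransformationCase("areas", "area_name", "Area", "Place")),
        // New case: would normalize "Butler Ln" to "Butler Lane" in stops.txt.
        Arguments.of(new TransformationCase("stops", "stop_name", "Ln", "Lane"))
    );
}
```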
fake-agency-for-field-normalizing/areas.txt (new file)

@@ -0,0 +1,2 @@
+area_id,area_name
+Area One,"This is area one"

fake-agency-for-field-normalizing/booking_rules.txt (new file)

@@ -0,0 +1,2 @@
+booking_rule_id,booking_type,prior_notice_duration_min,prior_notice_duration_max,prior_notice_last_day,prior_notice_last_time,prior_notice_start_day,prior_notice_start_time,prior_notice_service_id,message,pickup_message,drop_off_message,phone_number,info_url,booking_url
+Booking 1,1,30,60,1,17:00:00,7,00:00:00,"04100312-8fe1-46a5-a9f2-556f39478f57","This is a message","This is a pickup message","This is a drop off message","123456789","http://www.info_url.com","http://www.booking_url.com"

fake-agency-for-field-normalizing/routes.txt (new file)

@@ -0,0 +1,3 @@
+agency_id,route_id,route_short_name,route_long_name,route_desc,route_type,route_url,route_color,route_text_color,route_branding_url
+1,1,1,Route 1,,3,,7CE6E7,FFFFFF,
+1,2,2,Route 2,,3,,7CE6E7,FFFFFF,

fake-agency-for-field-normalizing/stops.txt (new file)

@@ -0,0 +1,2 @@
+stop_id,stop_code,stop_name,stop_desc,stop_lat,stop_lon,zone_id,stop_url,location_type,parent_station,stop_timezone,wheelchair_boarding
+4u6g,4u6g,Butler Ln,,37.0612132,-122.0074332,,,0,,,

fake-agency-for-field-normalizing/trips.txt (new file)

@@ -0,0 +1,3 @@
+route_id,trip_id,trip_headsign,trip_short_name,direction_id,block_id,shape_id,bikes_allowed,wheelchair_accessible,service_id
+1,only-calendar-trip1,,,0,,,0,0,common_id
+2,only-calendar-trip2,,,0,,,0,0,common_id

Empty file.
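Read together with the parameterized cases in the test above, each fixture supplies the row its substitution targets. The expected first-row effects are sketched below, assuming the Substitution matching is case-insensitive (which the booking_rules case implies, since its fixture row spells "message" in lower case); the test itself only asserts that the replacement string appears in the row, not these exact values.

```java
// Hypothetical before/after view of the three cases, derived from the fixture
// rows above under the case-insensitivity assumption stated in the lead-in.
String[][] expectedEffects = {
    {"routes.route_long_name",       "Route 1",                  "Rte 1"},
    {"booking_rules.pickup_message", "This is a pickup message", "This is a pickup Msg"},
    {"areas.area_name",              "This is area one",         "This is Place one"},
};
```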