DM-45657 Add tests for consdb-hinfo and consdb-pqserver #33 (+200 −65)
Conversation
@pytest.fixture
def tmpdir(scope="module"):
Does this get cleaned up, or should we wrap the yield of tmpdir in a try/finally?
I understand this is not cleaned up automatically
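For what it's worth, pytest's built-in tmp_path_factory already provides temporary directories with automatic pruning of old runs, which would sidestep the cleanup question entirely. A minimal sketch (the directory name is illustrative, and note that scope normally goes on the decorator rather than the function signature):

```python
import pytest


# Sketch: tmp_path_factory is pytest's built-in, session-scoped factory.
# Directories land under a base temp root that pytest prunes automatically,
# keeping only the most recent few test runs.
@pytest.fixture(scope="module")
def tmpdir(tmp_path_factory):
    return tmp_path_factory.mktemp("consdb_test")
```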
@pytest.fixture
def engine(tmpdir, scope="module"):
Just for my own context and learning: this looks like it could be relevant in other test files where we need to create a SQLite engine. Do other test files exist yet that need it? Should we generalize this one, or just keep an eye on it as further tests are developed?
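If it ever does get generalized, a hypothetical conftest.py sketch (fixture and path names are illustrative, not the repo's actual API):

```python
# conftest.py -- hypothetical shared fixture for a file-backed SQLite engine
import pytest
import sqlalchemy


@pytest.fixture(scope="module")
def sqlite_engine(tmpdir):
    engine = sqlalchemy.create_engine(f"sqlite:///{tmpdir / 'test.db'}")
    yield engine
    engine.dispose()  # release pooled connections when the module finishes
```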
print(f"{row=}")
print(f"{row.exposure_name=}")

assert _header_lookup(header, "OBSID") == row.exposure_name
The two other asserts look straightforward for names, but since I'm still fresh here, do you mind showing me how OBSID and exposure_name correlate? I can't seem to line them up by searching.
The metadata translator populates exposure_name with OBSID; the mapping dict is in hinfo.py.
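Roughly like this, if it helps. This is an illustrative sketch of the mapping-dict idea, not the verbatim contents of hinfo.py:

```python
# Illustrative only: maps exposure-table columns to FITS header keywords
KW_MAPPING = {
    "exposure_name": "OBSID",  # DB column <- header keyword
    # ... further column -> keyword pairs ...
}
```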
        python-version: "3.11"
        cache: "pip"
    - name: Build docker image
      run: docker build -f Dockerfile.pytest -t pytest_image .
Wondering if this could be done in a container in the workflow (https://docs.github.com/en/actions/writing-workflows/workflow-syntax-for-github-actions#jobsjob_idcontainer) instead of building a new container and running in it. The advantage would be that more of the test code would live in the workflow file, where it's more visible, rather than in a separate Dockerfile.
RUN source loadLSST.bash && mamba install aiokafka httpx
RUN source loadLSST.bash && pip install kafkit
COPY python/lsst/consdb/hinfo.py python/lsst/consdb/utils.py ./hinfo/
RUN source loadLSST.bash && pip install kafkit aiokafka httpx
We generally try to avoid pip installing into conda environments whenever possible. Although it works pretty well nowadays, there's still some iffiness about the integration.
I tried to do this using only mamba, but it wouldn't cooperate. I think it's because we have to rely on newer versions of aiokafka that haven't made their way to conda yet. My understanding is that you can generally get away with mixing pip and conda as long as you never use conda again once you've used pip.
WORKDIR /home/lsst/
COPY --chown=lsst . ./consdb/
WORKDIR /home/lsst/consdb/
RUN source /opt/lsst/software/stack/loadLSST.bash && pip install -e .
I'm not sure why this is needed in addition to the install above.
RUN source loadLSST.bash && pip install kafkit aiokafka httpx

WORKDIR /home/lsst/
COPY --chown=lsst . ./consdb/
This seems like it copies too much for the hinfo service.
python/lsst/consdb/hinfo.py (outdated)

@@ -442,14 +443,16 @@ def get_kafka_config() -> KafkaConfig:

logger = setup_logging("consdb.hinfo")

instrument = os.environ["INSTRUMENT"]
logger.info(f"Instrument = {instrument}")
instrument = ""
The idea here is to have a more specific error than the KeyError that would be raised if the instrument is not defined? I'm not sure that an "Unrecognized instrument: " error (with no instrument name) is much better. Also, instrument = os.environ.get("INSTRUMENT", "") may be more idiomatic.
try:
    yield tmpdir
finally:
    shutil.rmtree(tmpdir)
It can often be useful to specify ignore_errors=True when using shutil.rmtree(), especially when working on NFS filesystems that sometimes leave behind .nfsNNNN files when removing things. In this case, since it's for temporary test files that should be on local storage, this is not an issue.
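i.e., something like:

```python
import shutil

# Tolerate stragglers (e.g. NFS .nfsNNNN files) during teardown
shutil.rmtree(tmpdir, ignore_errors=True)
```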
conn.execute(f"INSERT INTO schemas VALUES ('{schema}', '{schema_path}')")
with sqlite3.connect(schema_path) as schema_conn:
    schema_conn.executescript(sql.read_text())
schema_conn.close()
Shouldn't this be handled by the context manager?
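Worth noting that sqlite3's connection context manager only wraps the transaction (commit on success, rollback on error) and does not close the connection, so the explicit close() isn't redundant. A sketch that lets context managers handle both:

```python
import contextlib
import sqlite3

# contextlib.closing() closes the connection on exit; the inner "with" scopes
# the transaction, since sqlite3 connections don't close themselves on exit.
with contextlib.closing(sqlite3.connect(schema_path)) as schema_conn:
    with schema_conn:
        schema_conn.executescript(sql.read_text())
```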
schema_conn.close()

os.environ["POSTGRES_URL"] = f"sqlite:///{db_path}"
hinfo.engine = utils.setup_postgres()
Hm. Maybe we should rename this method, since it's obviously not setting up postgres...
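Something like this, hypothetically (assuming the helper just builds a SQLAlchemy engine from the URL in POSTGRES_URL, as the test usage suggests):

```python
import os

from sqlalchemy import create_engine


# Hypothetical rename: the URL scheme, not the function name, selects the backend
def setup_database():
    return create_engine(os.environ["POSTGRES_URL"])
```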
def _header_lookup(header, key):
    for line in header:
This is fine for tests, but for production code it might be better to load the array of key/value pairs into a Python dict for faster lookups.
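Something like the following, assuming each header entry exposes keyword and value fields (the field names here are illustrative):

```python
# Hypothetical: build the lookup table once, then every query is O(1)
header_map = {entry["keyword"]: entry["value"] for entry in header}
value = header_map.get("OBSID")
```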
os.environ["INSTRUMENT"] = "LATISS"
tmpdir = Path(tempfile.mkdtemp())
try:
    yield tmpdir
Can the yield actually ever fail? I think you could remove the try/finally (here and in the other test).
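i.e., a sketch of the simplified form; pytest resumes a yield fixture for teardown whether the test passed or failed, so the cleanup still runs:

```python
import os
import shutil
import tempfile
from pathlib import Path

import pytest


@pytest.fixture(scope="module")
def tmpdir():
    os.environ["INSTRUMENT"] = "LATISS"
    path = Path(tempfile.mkdtemp())
    yield path  # code below runs as teardown, pass or fail
    shutil.rmtree(path)
```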
More testing is still desirable for consdb, but this begins to cover hinfo. Aside from code, I'm adding two files to the test directory: an updated cdb_latiss.sql freshly generated from felis, and a YAML file pulled from S3.