Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update download URLs for SDSS DR18 #3017

Merged
merged 1 commit into from
Jun 28, 2024
Merged

Conversation

weaverba137
Copy link
Member

@weaverba137 weaverba137 commented May 31, 2024

This PR closes #3011.

Background: SDSS-V has changed the paths used to access data files. Both imaging and spectroscopic data are affected. As of SDSS-V/DR18, the SQL access does not appear to have changed. Updating these URLs may continue to work throughout SDSS-V, but that is not guaranteed. The SDSS-V data team recommends using sdss-access.

  • Update URL for imaging.
  • Double-check that 5-digit plate numbers do work.
  • Update URL for SDSS-IV and earlier spectroscopy.
  • Would existing queries in astroquery.sdss ever return a SDSS-V spectrum?
  • Update URL for SDSS-V spectroscopy, fieldID-MJD-fiberID instead of plate-MJD-fiberID.
  • Update tests.

Copy link

codecov bot commented May 31, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 67.30%. Comparing base (091e6b3) to head (32d1a3d).
Report is 14 commits behind head on main.

Current head 32d1a3d differs from pull request most recent head b014ad0

Please upload reports for the commit b014ad0 to get more accurate results.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3017      +/-   ##
==========================================
- Coverage   67.60%   67.30%   -0.31%     
==========================================
  Files         233      231       -2     
  Lines       18271    18280       +9     
==========================================
- Hits        12352    12303      -49     
- Misses       5919     5977      +58     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pep8speaks
Copy link

pep8speaks commented Jun 10, 2024

Hello @weaverba137! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2024-06-28 00:05:38 UTC

@bsipocz bsipocz added the sdss label Jun 11, 2024
@bsipocz bsipocz added this to the v0.4.8 milestone Jun 11, 2024
@weaverba137 weaverba137 marked this pull request as ready for review June 12, 2024 07:11
@weaverba137
Copy link
Member Author

This is ready for review. Before merging there is an important question to consider: The get_spectra() method should be capable of downloading any SDSS spectrum file, including a new survey in DR18 called "eFEDS". However, the existing queries will not return these spectra because their metadata are stored in a different table which is not currently accessed. As a workaround, users can pass a matches Table which has the necessary metadata (this is not new functionality).

I've been informed that in DR19 there will be a unified way to access all SDSS/BOSS/eBOSS/eFEDS spectra in a single table, so I propose possibly documenting the matches workaround (I haven't done that yet) and waiting for DR19 before changing any queries.

Summary of changes:

  • I've already run a test with --remote-data any, including DR18, with no errors.
  • I've re-enabled some tests that were marked as xfail long-long ago. Please take a look at those in the diff.
  • Cleaned up some unused imports and function arguments.

I can squash commits if necessary. There was a bit of churn getting tests to work.

Copy link
Member

@bsipocz bsipocz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These all look great, than you so much for the fix.

Two things, neither of which I would hold this PR up for:

  • Can/should we change the default DR to 18?
  • Running the remote tests I got a different exception for test_query_timeout
================================================================================= FAILURES =================================================================================
____________________________________________________________________ TestSDSSRemote.test_query_timeout _____________________________________________________________________

self = <urllib3.connectionpool.HTTPSConnectionPool object at 0x11bd9d430>, conn = <urllib3.connection.HTTPSConnection object at 0x11b2b0b00>, method = 'GET'
url = '/dr17/en/tools/search/x_results.aspx?cmd=+SELECT%0D+p.ra%2C+p.dec%2C+p.objid%2C+p.run%2C+p.rerun%2C+p.camcol%2C+p.fie...8%28%28p.ra+BETWEEN+2.02319+AND+2.02374%29+AND+%28p.dec+BETWEEN+14.8395+AND+14.8401%29%29%29&format=csv&searchtool=SQL'
timeout = Timeout(connect=0.01, read=0.01, total=None), chunked = False
httplib_request_kw = {'body': None, 'headers': {'User-Agent': 'astroquery/0.4.8.dev9345_testrun_testrun Python/3.12.1 (Darwin) python-reque...=HCNLOJOANGIBHJGAJMLLPOKI; ASPSESSIONIDSQSCCQRQ=ICNLOJOAOBLOGAIACNDFMJJD; ASP.NET_SessionId=nlotaiz2ednb1acriy14pjms'}}
timeout_obj = Timeout(connect=0.01, read=0.01, total=None), read_timeout = 0.01

    def _make_request(
        self, conn, method, url, timeout=_Default, chunked=False, **httplib_request_kw
    ):
        """
        Perform a request on a given urllib connection object taken from our
        pool.
    
        :param conn:
            a connection from one of our connection pools
    
        :param timeout:
            Socket timeout in seconds for the request. This can be a
            float or integer, which will set the same timeout value for
            the socket connect and the socket read, or an instance of
            :class:`urllib3.util.Timeout`, which gives you more fine-grained
            control over your timeouts.
        """
        self.num_requests += 1
    
        timeout_obj = self._get_timeout(timeout)
        timeout_obj.start_connect()
        conn.timeout = Timeout.resolve_default_timeout(timeout_obj.connect_timeout)
    
        # Trigger any extra validation we need to do.
        try:
            self._validate_conn(conn)
        except (SocketTimeout, BaseSSLError) as e:
            # Py2 raises this as a BaseSSLError, Py3 raises it as socket timeout.
            self._raise_timeout(err=e, url=url, timeout_value=conn.timeout)
            raise
    
        # conn.request() calls http.client.*.request, not the method in
        # urllib3.request. It also calls makefile (recv) on the socket.
        try:
            if chunked:
                conn.request_chunked(method, url, **httplib_request_kw)
            else:
                conn.request(method, url, **httplib_request_kw)
    
        # We are swallowing BrokenPipeError (errno.EPIPE) since the server is
        # legitimately able to close the connection after sending a valid response.
        # With this behaviour, the received response is still readable.
        except BrokenPipeError:
            # Python 3
            pass
        except IOError as e:
            # Python 2 and macOS/Linux
            # EPIPE and ESHUTDOWN are BrokenPipeError on Python 2, and EPROTOTYPE is needed on macOS
            # https://erickt.github.io/blog/2014/11/19/adventures-in-debugging-a-potential-osx-kernel-bug/
            if e.errno not in {
                errno.EPIPE,
                errno.ESHUTDOWN,
                errno.EPROTOTYPE,
            }:
                raise
    
        # Reset the timeout for the recv() on the socket
        read_timeout = timeout_obj.read_timeout
    
        # App Engine doesn't have a sock attr
        if getattr(conn, "sock", None):
            # In Python 3 socket.py will catch EAGAIN and return None when you
            # try and read into the file pointer created by http.client, which
            # instead raises a BadStatusLine exception. Instead of catching
            # the exception and assuming all BadStatusLine exceptions are read
            # timeouts, check for a zero timeout before making the request.
            if read_timeout == 0:
                raise ReadTimeoutError(
                    self, url, "Read timed out. (read timeout=%s)" % read_timeout
                )
            if read_timeout is Timeout.DEFAULT_TIMEOUT:
                conn.sock.settimeout(socket.getdefaulttimeout())
            else:  # None or a value
                conn.sock.settimeout(read_timeout)
    
        # Receive the response from the server
        try:
            try:
                # Python 2.7, use buffering of HTTP responses
                httplib_response = conn.getresponse(buffering=True)
            except TypeError:
                # Python 3
                try:
                    httplib_response = conn.getresponse()
                except BaseException as e:
                    # Remove the TypeError from the exception chain in
                    # Python 3 (including for exceptions like SystemExit).
                    # Otherwise it looks like a bug in the code.
>                   six.raise_from(e, None)

../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/urllib3/connectionpool.py:467: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/urllib3/connectionpool.py:462: in _make_request
    httplib_response = conn.getresponse()
../../../.pyenv/versions/3.12.1/lib/python3.12/http/client.py:1419: in getresponse
    response.begin()
../../../.pyenv/versions/3.12.1/lib/python3.12/http/client.py:331: in begin
    version, status, reason = self._read_status()
../../../.pyenv/versions/3.12.1/lib/python3.12/http/client.py:292: in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
../../../.pyenv/versions/3.12.1/lib/python3.12/socket.py:707: in readinto
    return self._sock.recv_into(b)
../../../.pyenv/versions/3.12.1/lib/python3.12/ssl.py:1253: in recv_into
    return self.read(nbytes, buffer)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <ssl.SSLSocket [closed] fd=-1, family=2, type=1, proto=0>, len = 8192, buffer = <memory at 0x11ba80100>

    def read(self, len=1024, buffer=None):
        """Read up to LEN bytes and return them.
        Return zero-length string on EOF."""
    
        self._checkClosed()
        if self._sslobj is None:
            raise ValueError("Read on closed or unwrapped SSL socket.")
        try:
            if buffer is not None:
>               return self._sslobj.read(len, buffer)
E               TimeoutError: The read operation timed out

../../../.pyenv/versions/3.12.1/lib/python3.12/ssl.py:1105: TimeoutError

During handling of the above exception, another exception occurred:

self = <requests.adapters.HTTPAdapter object at 0x11bb29e20>, request = <PreparedRequest [GET]>, stream = False, timeout = Timeout(connect=0.01, read=0.01, total=None)
verify = True, cert = None, proxies = OrderedDict()

    def send(
        self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None
    ):
        """Sends PreparedRequest object. Returns Response object.
    
        :param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
        :param stream: (optional) Whether to stream the request content.
        :param timeout: (optional) How long to wait for the server to send
            data before giving up, as a float, or a :ref:`(connect timeout,
            read timeout) <timeouts>` tuple.
        :type timeout: float or tuple or urllib3 Timeout object
        :param verify: (optional) Either a boolean, in which case it controls whether
            we verify the server's TLS certificate, or a string, in which case it
            must be a path to a CA bundle to use
        :param cert: (optional) Any user-provided SSL certificate to be trusted.
        :param proxies: (optional) The proxies dictionary to apply to the request.
        :rtype: requests.Response
        """
    
        try:
            conn = self.get_connection(request.url, proxies)
        except LocationValueError as e:
            raise InvalidURL(e, request=request)
    
        self.cert_verify(conn, request.url, verify, cert)
        url = self.request_url(request, proxies)
        self.add_headers(
            request,
            stream=stream,
            timeout=timeout,
            verify=verify,
            cert=cert,
            proxies=proxies,
        )
    
        chunked = not (request.body is None or "Content-Length" in request.headers)
    
        if isinstance(timeout, tuple):
            try:
                connect, read = timeout
                timeout = TimeoutSauce(connect=connect, read=read)
            except ValueError:
                raise ValueError(
                    f"Invalid timeout {timeout}. Pass a (connect, read) timeout tuple, "
                    f"or a single float to set both timeouts to the same value."
                )
        elif isinstance(timeout, TimeoutSauce):
            pass
        else:
            timeout = TimeoutSauce(connect=timeout, read=timeout)
    
        try:
>           resp = conn.urlopen(
                method=request.method,
                url=url,
                body=request.body,
                headers=request.headers,
                redirect=False,
                assert_same_host=False,
                preload_content=False,
                decode_content=False,
                retries=self.max_retries,
                timeout=timeout,
                chunked=chunked,
            )

../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/requests/adapters.py:486: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/urllib3/connectionpool.py:799: in urlopen
    retries = retries.increment(
../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/urllib3/util/retry.py:550: in increment
    raise six.reraise(type(error), error, _stacktrace)
../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/urllib3/packages/six.py:770: in reraise
    raise value
../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/urllib3/connectionpool.py:715: in urlopen
    httplib_response = self._make_request(
../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/urllib3/connectionpool.py:469: in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <urllib3.connectionpool.HTTPSConnectionPool object at 0x11bd9d430>, err = TimeoutError('The read operation timed out')
url = '/dr17/en/tools/search/x_results.aspx?cmd=+SELECT%0D+p.ra%2C+p.dec%2C+p.objid%2C+p.run%2C+p.rerun%2C+p.camcol%2C+p.fie...8%28%28p.ra+BETWEEN+2.02319+AND+2.02374%29+AND+%28p.dec+BETWEEN+14.8395+AND+14.8401%29%29%29&format=csv&searchtool=SQL'
timeout_value = 0.01

    def _raise_timeout(self, err, url, timeout_value):
        """Is the error actually a timeout? Will raise a ReadTimeout or pass"""
    
        if isinstance(err, SocketTimeout):
>           raise ReadTimeoutError(
                self, url, "Read timed out. (read timeout=%s)" % timeout_value
            )
E           urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='skyserver.sdss.org', port=443): Read timed out. (read timeout=0.01)

../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/urllib3/connectionpool.py:358: ReadTimeoutError

During handling of the above exception, another exception occurred:

self = <astroquery.sdss.tests.test_sdss_remote.TestSDSSRemote object at 0x11bdcc980>

    def test_query_timeout(self):
        with pytest.raises(ConnectTimeout):
>           sdss.SDSS.query_region(self.coords, width=2.0 * u.arcsec, cache=False, timeout=self.mintimeout)

astroquery/sdss/tests/test_sdss_remote.py:187: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
astroquery/utils/class_or_instance.py:25: in f
    return self.fn(obj, *args, **kwds)
astroquery/utils/process_asyncs.py:26: in newmethod
    response = getattr(self, async_method_name)(*args, **kwargs)
astroquery/sdss/core.py:387: in query_region_async
    return self.query_sql_async(sql_query, timeout=timeout,
astroquery/sdss/core.py:622: in query_sql_async
    response = self._request("GET", url, params=request_payload,
astroquery/query.py:368: in _request
    response = query.request(self._session, stream=stream,
astroquery/query.py:77: in request
    return session.request(self.method, self.url, params=self.params,
../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/requests/sessions.py:589: in request
    resp = self.send(prep, **send_kwargs)
../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/requests/sessions.py:703: in send
    r = adapter.send(request, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <requests.adapters.HTTPAdapter object at 0x11bb29e20>, request = <PreparedRequest [GET]>, stream = False, timeout = Timeout(connect=0.01, read=0.01, total=None)
verify = True, cert = None, proxies = OrderedDict()

    def send(
        self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None
    ):
        """Sends PreparedRequest object. Returns Response object.
    
        :param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
        :param stream: (optional) Whether to stream the request content.
        :param timeout: (optional) How long to wait for the server to send
            data before giving up, as a float, or a :ref:`(connect timeout,
            read timeout) <timeouts>` tuple.
        :type timeout: float or tuple or urllib3 Timeout object
        :param verify: (optional) Either a boolean, in which case it controls whether
            we verify the server's TLS certificate, or a string, in which case it
            must be a path to a CA bundle to use
        :param cert: (optional) Any user-provided SSL certificate to be trusted.
        :param proxies: (optional) The proxies dictionary to apply to the request.
        :rtype: requests.Response
        """
    
        try:
            conn = self.get_connection(request.url, proxies)
        except LocationValueError as e:
            raise InvalidURL(e, request=request)
    
        self.cert_verify(conn, request.url, verify, cert)
        url = self.request_url(request, proxies)
        self.add_headers(
            request,
            stream=stream,
            timeout=timeout,
            verify=verify,
            cert=cert,
            proxies=proxies,
        )
    
        chunked = not (request.body is None or "Content-Length" in request.headers)
    
        if isinstance(timeout, tuple):
            try:
                connect, read = timeout
                timeout = TimeoutSauce(connect=connect, read=read)
            except ValueError:
                raise ValueError(
                    f"Invalid timeout {timeout}. Pass a (connect, read) timeout tuple, "
                    f"or a single float to set both timeouts to the same value."
                )
        elif isinstance(timeout, TimeoutSauce):
            pass
        else:
            timeout = TimeoutSauce(connect=timeout, read=timeout)
    
        try:
            resp = conn.urlopen(
                method=request.method,
                url=url,
                body=request.body,
                headers=request.headers,
                redirect=False,
                assert_same_host=False,
                preload_content=False,
                decode_content=False,
                retries=self.max_retries,
                timeout=timeout,
                chunked=chunked,
            )
    
        except (ProtocolError, OSError) as err:
            raise ConnectionError(err, request=request)
    
        except MaxRetryError as e:
            if isinstance(e.reason, ConnectTimeoutError):
                # TODO: Remove this in 3.0.0: see #2811
                if not isinstance(e.reason, NewConnectionError):
                    raise ConnectTimeout(e, request=request)
    
            if isinstance(e.reason, ResponseError):
                raise RetryError(e, request=request)
    
            if isinstance(e.reason, _ProxyError):
                raise ProxyError(e, request=request)
    
            if isinstance(e.reason, _SSLError):
                # This branch is for urllib3 v1.22 and later.
                raise SSLError(e, request=request)
    
            raise ConnectionError(e, request=request)
    
        except ClosedPoolError as e:
            raise ConnectionError(e, request=request)
    
        except _ProxyError as e:
            raise ProxyError(e)
    
        except (_SSLError, _HTTPError) as e:
            if isinstance(e, _SSLError):
                # This branch is for urllib3 versions earlier than v1.22
                raise SSLError(e, request=request)
            elif isinstance(e, ReadTimeoutError):
>               raise ReadTimeout(e, request=request)
E               requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='skyserver.sdss.org', port=443): Read timed out. (read timeout=0.01)

../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/requests/adapters.py:532: ReadTimeout
========================================================================= short test summary info ==========================================================================
FAILED astroquery/sdss/tests/test_sdss_remote.py::TestSDSSRemote::test_query_timeout - requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='skyserver.sdss.org', port=443): Read timed out. (read timeout=0.01)
====================================================================== 1 failed, 681 passed in 45.06s ======================================================================

CHANGES.rst Show resolved Hide resolved
astroquery/sdss/core.py Outdated Show resolved Hide resolved
astroquery/sdss/tests/test_sdss.py Show resolved Hide resolved
astroquery/sdss/tests/test_sdss.py Outdated Show resolved Hide resolved
astroquery/sdss/tests/test_sdss_remote.py Outdated Show resolved Hide resolved
@bsipocz
Copy link
Member

bsipocz commented Jun 12, 2024

Other things:

I've been informed that in DR19 there will be a unified way to access all SDSS/BOSS/eBOSS/eFEDS spectra in a single table, so I propose possibly documenting the matches workaround (I haven't done that yet) and waiting for DR19 before changing any queries.

Sounds totally reasonable to wait. Could you open a reminder issue for this though?

I can squash commits if necessary. There was a bit of churn getting tests to work.

Yes, it would be great to squash the iterations of tests/style a little bit.

@weaverba137
Copy link
Member Author

@bsipocz when you got the different exception above, could you describe what environment you were using? It's probably sufficient to just list the Python and requests version.

@bsipocz
Copy link
Member

bsipocz commented Jun 14, 2024

I see the ReadTimeout on py3.12 and request 2.31 but also when upgraded to the latest 2.32, but also can reproduce it with the tox environment oldestdeps. It really is consistent in all the envs I tried it in. Looking more into the lines, I think it'll be just the matter of importing TimeoutError from the right place (I haven't tried it though).

@weaverba137
Copy link
Member Author

I don't think we've quite converged on a consistent set of exceptions to use, but everything else should be done at this point.

@weaverba137
Copy link
Member Author

I used tox -e py310-test-online -- -P sdss to run the remote data tests, and found no errors using ConnectTimeout. The requests version was 2.32.3.

@bsipocz
Copy link
Member

bsipocz commented Jun 21, 2024

I still see the failure, even on the latest requests version

$ ~/munka/devel/astroquery [sdss-dr18-urls L|✚ 1…1⚑ 10] $ pytest -P sdss -R
================================================= test session starts ==================================================
platform darwin -- Python 3.12.1, pytest-8.2.2, pluggy-1.5.0

Running tests with astroquery version 0.4.8.dev9433_testrun_testrun.

Running tests with astropy_helpers version 4.0.1.
Running tests in astroquery docs.

Date: 2024-06-21T10:52:28

Platform: macOS-12.5.1-arm64-arm-64bit

Executable: /Users/bsipocz/.pyenv/versions/3.12.1/bin/python

Full Python Version: 
3.12.1 (main, Dec 22 2023, 11:59:08) [Clang 14.0.0 (clang-1400.0.29.202)]

encodings: sys: utf-8, locale: UTF-8, filesystem: utf-8
byteorder: little
float info: dig: 15, mant_dig: 15

Package versions: 
Numpy: 2.0.0
Matplotlib: 3.9.0
Astropy: 7.0.0.dev92+gecfcf0f0c7.d20240519
regions: 0.9
pyVO: 1.6.dev130+gd2d4efa
mocpy: 0.14.0
astropy-healpix: 1.0.3
vamdclib: not available

Using Astropy options: remote_data: any.

rootdir: /Users/bsipocz/munka/devel/astroquery
configfile: setup.cfg
testpaths: astroquery, docs
plugins: astropy-0.11.0, cov-5.0.0, remotedata-0.4.1, anyio-4.3.0, rerunfailures-14.0, filter-subpackage-0.2.0, doctestplus-1.2.1, astropy-header-0.2.2, dependency-0.6.0, hypothesis-6.100.5, asdf-3.2.0, arraydiff-0.6.1, nbval-0.11.0, mock-3.14.0, xdist-3.6.1, requests-mock-1.12.1
collected 682 items                                                                                                    

astroquery/sdss/tests/test_sdss.py ............................................................................. [ 11%]
................................................................................................................ [ 27%]
................................................................................................................ [ 44%]
................................................................................................................ [ 60%]
................................................................................................................ [ 76%]
..............................................................................................................   [ 93%]
astroquery/sdss/tests/test_sdss_remote.py ......................F.......................                         [ 99%]
docs/sdss/sdss.rst .                                                                                             [100%]

======================================================= FAILURES =======================================================
__________________________________________ TestSDSSRemote.test_query_timeout ___________________________________________

self = <urllib3.connectionpool.HTTPSConnectionPool object at 0x12becb1d0>
conn = <urllib3.connection.HTTPSConnection object at 0x12b9e0350>, method = 'GET'
url = '/dr17/en/tools/search/x_results.aspx?cmd=+SELECT%0D+p.ra%2C+p.dec%2C+p.objid%2C+p.run%2C+p.rerun%2C+p.camcol%2C+p.fie...8%28%28p.ra+BETWEEN+2.02319+AND+2.02374%29+AND+%28p.dec+BETWEEN+14.8395+AND+14.8401%29%29%29&format=csv&searchtool=SQL'
timeout = Timeout(connect=0.01, read=0.01, total=None), chunked = False
httplib_request_kw = {'body': None, 'headers': {'User-Agent': 'astroquery/0.4.8.dev9433_testrun_testrun Python/3.12.1 (Darwin) python-reque...eUi7P00A; SERVERID=dsa003; ASPSESSIONIDSQTDAQSR=KFGGBOJCGDFCMEGMMKIMCKNM; ASP.NET_SessionId=bjsu10jlsdrlfpaeevp1bn0s'}}
timeout_obj = Timeout(connect=0.01, read=0.01, total=None), read_timeout = 0.01

    def _make_request(
        self, conn, method, url, timeout=_Default, chunked=False, **httplib_request_kw
    ):
        """
        Perform a request on a given urllib connection object taken from our
        pool.
    
        :param conn:
            a connection from one of our connection pools
    
        :param timeout:
            Socket timeout in seconds for the request. This can be a
            float or integer, which will set the same timeout value for
            the socket connect and the socket read, or an instance of
            :class:`urllib3.util.Timeout`, which gives you more fine-grained
            control over your timeouts.
        """
        self.num_requests += 1
    
        timeout_obj = self._get_timeout(timeout)
        timeout_obj.start_connect()
        conn.timeout = Timeout.resolve_default_timeout(timeout_obj.connect_timeout)
    
        # Trigger any extra validation we need to do.
        try:
            self._validate_conn(conn)
        except (SocketTimeout, BaseSSLError) as e:
            # Py2 raises this as a BaseSSLError, Py3 raises it as socket timeout.
            self._raise_timeout(err=e, url=url, timeout_value=conn.timeout)
            raise
    
        # conn.request() calls http.client.*.request, not the method in
        # urllib3.request. It also calls makefile (recv) on the socket.
        try:
            if chunked:
                conn.request_chunked(method, url, **httplib_request_kw)
            else:
                conn.request(method, url, **httplib_request_kw)
    
        # We are swallowing BrokenPipeError (errno.EPIPE) since the server is
        # legitimately able to close the connection after sending a valid response.
        # With this behaviour, the received response is still readable.
        except BrokenPipeError:
            # Python 3
            pass
        except IOError as e:
            # Python 2 and macOS/Linux
            # EPIPE and ESHUTDOWN are BrokenPipeError on Python 2, and EPROTOTYPE is needed on macOS
            # https://erickt.github.io/blog/2014/11/19/adventures-in-debugging-a-potential-osx-kernel-bug/
            if e.errno not in {
                errno.EPIPE,
                errno.ESHUTDOWN,
                errno.EPROTOTYPE,
            }:
                raise
    
        # Reset the timeout for the recv() on the socket
        read_timeout = timeout_obj.read_timeout
    
        # App Engine doesn't have a sock attr
        if getattr(conn, "sock", None):
            # In Python 3 socket.py will catch EAGAIN and return None when you
            # try and read into the file pointer created by http.client, which
            # instead raises a BadStatusLine exception. Instead of catching
            # the exception and assuming all BadStatusLine exceptions are read
            # timeouts, check for a zero timeout before making the request.
            if read_timeout == 0:
                raise ReadTimeoutError(
                    self, url, "Read timed out. (read timeout=%s)" % read_timeout
                )
            if read_timeout is Timeout.DEFAULT_TIMEOUT:
                conn.sock.settimeout(socket.getdefaulttimeout())
            else:  # None or a value
                conn.sock.settimeout(read_timeout)
    
        # Receive the response from the server
        try:
            try:
                # Python 2.7, use buffering of HTTP responses
                httplib_response = conn.getresponse(buffering=True)
            except TypeError:
                # Python 3
                try:
                    httplib_response = conn.getresponse()
                except BaseException as e:
                    # Remove the TypeError from the exception chain in
                    # Python 3 (including for exceptions like SystemExit).
                    # Otherwise it looks like a bug in the code.
>                   six.raise_from(e, None)

../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/urllib3/connectionpool.py:467: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/urllib3/connectionpool.py:462: in _make_request
    httplib_response = conn.getresponse()
../../../.pyenv/versions/3.12.1/lib/python3.12/http/client.py:1419: in getresponse
    response.begin()
../../../.pyenv/versions/3.12.1/lib/python3.12/http/client.py:331: in begin
    version, status, reason = self._read_status()
../../../.pyenv/versions/3.12.1/lib/python3.12/http/client.py:292: in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
../../../.pyenv/versions/3.12.1/lib/python3.12/socket.py:707: in readinto
    return self._sock.recv_into(b)
../../../.pyenv/versions/3.12.1/lib/python3.12/ssl.py:1253: in recv_into
    return self.read(nbytes, buffer)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <ssl.SSLSocket [closed] fd=-1, family=2, type=1, proto=0>, len = 8192, buffer = <memory at 0x12b3b9c00>

    def read(self, len=1024, buffer=None):
        """Read up to LEN bytes and return them.
        Return zero-length string on EOF."""
    
        self._checkClosed()
        if self._sslobj is None:
            raise ValueError("Read on closed or unwrapped SSL socket.")
        try:
            if buffer is not None:
>               return self._sslobj.read(len, buffer)
E               TimeoutError: The read operation timed out

../../../.pyenv/versions/3.12.1/lib/python3.12/ssl.py:1105: TimeoutError

During handling of the above exception, another exception occurred:

self = <requests.adapters.HTTPAdapter object at 0x12b8b4170>, request = <PreparedRequest [GET]>, stream = False
timeout = Timeout(connect=0.01, read=0.01, total=None), verify = True, cert = None, proxies = OrderedDict()

    def send(
        self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None
    ):
        """Sends PreparedRequest object. Returns Response object.
    
        :param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
        :param stream: (optional) Whether to stream the request content.
        :param timeout: (optional) How long to wait for the server to send
            data before giving up, as a float, or a :ref:`(connect timeout,
            read timeout) <timeouts>` tuple.
        :type timeout: float or tuple or urllib3 Timeout object
        :param verify: (optional) Either a boolean, in which case it controls whether
            we verify the server's TLS certificate, or a string, in which case it
            must be a path to a CA bundle to use
        :param cert: (optional) Any user-provided SSL certificate to be trusted.
        :param proxies: (optional) The proxies dictionary to apply to the request.
        :rtype: requests.Response
        """
    
        try:
            conn = self.get_connection_with_tls_context(
                request, verify, proxies=proxies, cert=cert
            )
        except LocationValueError as e:
            raise InvalidURL(e, request=request)
    
        self.cert_verify(conn, request.url, verify, cert)
        url = self.request_url(request, proxies)
        self.add_headers(
            request,
            stream=stream,
            timeout=timeout,
            verify=verify,
            cert=cert,
            proxies=proxies,
        )
    
        chunked = not (request.body is None or "Content-Length" in request.headers)
    
        if isinstance(timeout, tuple):
            try:
                connect, read = timeout
                timeout = TimeoutSauce(connect=connect, read=read)
            except ValueError:
                raise ValueError(
                    f"Invalid timeout {timeout}. Pass a (connect, read) timeout tuple, "
                    f"or a single float to set both timeouts to the same value."
                )
        elif isinstance(timeout, TimeoutSauce):
            pass
        else:
            timeout = TimeoutSauce(connect=timeout, read=timeout)
    
        try:
>           resp = conn.urlopen(
                method=request.method,
                url=url,
                body=request.body,
                headers=request.headers,
                redirect=False,
                assert_same_host=False,
                preload_content=False,
                decode_content=False,
                retries=self.max_retries,
                timeout=timeout,
                chunked=chunked,
            )

../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/requests/adapters.py:667: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/urllib3/connectionpool.py:799: in urlopen
    retries = retries.increment(
../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/urllib3/util/retry.py:550: in increment
    raise six.reraise(type(error), error, _stacktrace)
../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/urllib3/packages/six.py:770: in reraise
    raise value
../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/urllib3/connectionpool.py:715: in urlopen
    httplib_response = self._make_request(
../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/urllib3/connectionpool.py:469: in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <urllib3.connectionpool.HTTPSConnectionPool object at 0x12becb1d0>
err = TimeoutError('The read operation timed out')
url = '/dr17/en/tools/search/x_results.aspx?cmd=+SELECT%0D+p.ra%2C+p.dec%2C+p.objid%2C+p.run%2C+p.rerun%2C+p.camcol%2C+p.fie...8%28%28p.ra+BETWEEN+2.02319+AND+2.02374%29+AND+%28p.dec+BETWEEN+14.8395+AND+14.8401%29%29%29&format=csv&searchtool=SQL'
timeout_value = 0.01

    def _raise_timeout(self, err, url, timeout_value):
        """Is the error actually a timeout? Will raise a ReadTimeout or pass"""
    
        if isinstance(err, SocketTimeout):
>           raise ReadTimeoutError(
                self, url, "Read timed out. (read timeout=%s)" % timeout_value
            )
E           urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='skyserver.sdss.org', port=443): Read timed out. (read timeout=0.01)

../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/urllib3/connectionpool.py:358: ReadTimeoutError

During handling of the above exception, another exception occurred:

self = <astroquery.sdss.tests.test_sdss_remote.TestSDSSRemote object at 0x12bbf3a10>

    def test_query_timeout(self):
        with pytest.raises(ConnectTimeout):
>           sdss.SDSS.query_region(self.coords, width=2.0 * u.arcsec, cache=False, timeout=self.mintimeout)

astroquery/sdss/tests/test_sdss_remote.py:187: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
astroquery/utils/class_or_instance.py:25: in f
    return self.fn(obj, *args, **kwds)
astroquery/utils/process_asyncs.py:26: in newmethod
    response = getattr(self, async_method_name)(*args, **kwargs)
astroquery/sdss/core.py:387: in query_region_async
    return self.query_sql_async(sql_query, timeout=timeout,
astroquery/sdss/core.py:622: in query_sql_async
    response = self._request("GET", url, params=request_payload,
astroquery/query.py:372: in _request
    response = query.request(self._session, stream=stream,
astroquery/query.py:77: in request
    return session.request(self.method, self.url, params=self.params,
../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/requests/sessions.py:589: in request
    resp = self.send(prep, **send_kwargs)
../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/requests/sessions.py:703: in send
    r = adapter.send(request, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <requests.adapters.HTTPAdapter object at 0x12b8b4170>, request = <PreparedRequest [GET]>, stream = False
timeout = Timeout(connect=0.01, read=0.01, total=None), verify = True, cert = None, proxies = OrderedDict()

    def send(
        self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None
    ):
        """Sends PreparedRequest object. Returns Response object.
    
        :param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
        :param stream: (optional) Whether to stream the request content.
        :param timeout: (optional) How long to wait for the server to send
            data before giving up, as a float, or a :ref:`(connect timeout,
            read timeout) <timeouts>` tuple.
        :type timeout: float or tuple or urllib3 Timeout object
        :param verify: (optional) Either a boolean, in which case it controls whether
            we verify the server's TLS certificate, or a string, in which case it
            must be a path to a CA bundle to use
        :param cert: (optional) Any user-provided SSL certificate to be trusted.
        :param proxies: (optional) The proxies dictionary to apply to the request.
        :rtype: requests.Response
        """
    
        try:
            conn = self.get_connection_with_tls_context(
                request, verify, proxies=proxies, cert=cert
            )
        except LocationValueError as e:
            raise InvalidURL(e, request=request)
    
        self.cert_verify(conn, request.url, verify, cert)
        url = self.request_url(request, proxies)
        self.add_headers(
            request,
            stream=stream,
            timeout=timeout,
            verify=verify,
            cert=cert,
            proxies=proxies,
        )
    
        chunked = not (request.body is None or "Content-Length" in request.headers)
    
        if isinstance(timeout, tuple):
            try:
                connect, read = timeout
                timeout = TimeoutSauce(connect=connect, read=read)
            except ValueError:
                raise ValueError(
                    f"Invalid timeout {timeout}. Pass a (connect, read) timeout tuple, "
                    f"or a single float to set both timeouts to the same value."
                )
        elif isinstance(timeout, TimeoutSauce):
            pass
        else:
            timeout = TimeoutSauce(connect=timeout, read=timeout)
    
        try:
            resp = conn.urlopen(
                method=request.method,
                url=url,
                body=request.body,
                headers=request.headers,
                redirect=False,
                assert_same_host=False,
                preload_content=False,
                decode_content=False,
                retries=self.max_retries,
                timeout=timeout,
                chunked=chunked,
            )
    
        except (ProtocolError, OSError) as err:
            raise ConnectionError(err, request=request)
    
        except MaxRetryError as e:
            if isinstance(e.reason, ConnectTimeoutError):
                # TODO: Remove this in 3.0.0: see #2811
                if not isinstance(e.reason, NewConnectionError):
                    raise ConnectTimeout(e, request=request)
    
            if isinstance(e.reason, ResponseError):
                raise RetryError(e, request=request)
    
            if isinstance(e.reason, _ProxyError):
                raise ProxyError(e, request=request)
    
            if isinstance(e.reason, _SSLError):
                # This branch is for urllib3 v1.22 and later.
                raise SSLError(e, request=request)
    
            raise ConnectionError(e, request=request)
    
        except ClosedPoolError as e:
            raise ConnectionError(e, request=request)
    
        except _ProxyError as e:
            raise ProxyError(e)
    
        except (_SSLError, _HTTPError) as e:
            if isinstance(e, _SSLError):
                # This branch is for urllib3 versions earlier than v1.22
                raise SSLError(e, request=request)
            elif isinstance(e, ReadTimeoutError):
>               raise ReadTimeout(e, request=request)
E               requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='skyserver.sdss.org', port=443): Read timed out. (read timeout=0.01)

../../../.pyenv/versions/3.12.1/lib/python3.12/site-packages/requests/adapters.py:713: ReadTimeout
=============================================== short test summary info ================================================
FAILED astroquery/sdss/tests/test_sdss_remote.py::TestSDSSRemote::test_query_timeout - requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='skyserver.sdss.org', port=443): Read timed out. (read ti...
============================================ 1 failed, 681 passed in 46.53s ============================================

@weaverba137
Copy link
Member Author

@bsipocz I wonder if your network is so fast that you are seeing a timeout after a connection is established, while I am seeing a timeout before a connection is established.

In any case, I think the correct exception is actually:

class Timeout(RequestException):
    """The request timed out.

    Catching this error will catch both
    :exc:`~requests.exceptions.ConnectTimeout` and
    :exc:`~requests.exceptions.ReadTimeout` errors.
    """

@weaverba137
Copy link
Member Author

OK, please test with requests.exceptions.Timeout.

@bsipocz
Copy link
Member

bsipocz commented Jun 21, 2024

Now it all works. I don't think I have a particularly fast internet, but it may indeed routed differently than if you were testing from a university network. Anyway, the suggested solution is correct, we shouldn't really care what type of timeout one runs into.

@bsipocz
Copy link
Member

bsipocz commented Jun 21, 2024

Maybe a rebase to get rid of the merge and squash some of the back-and-forths?

@weaverba137
Copy link
Member Author

Great! You should be able to squash when you merge, according to what I've read recently about merging on GitHub.

@bsipocz
Copy link
Member

bsipocz commented Jun 21, 2024

We don't do squash merges here as it doesn't fit into the workflow, but I'm happy to do the rebase/squash prior merging.

@weaverba137
Copy link
Member Author

That's OK, I'll go ahead with the squash now. It's good practice for me.

@bsipocz
Copy link
Member

bsipocz commented Jun 21, 2024

That's OK, I'll go ahead with the squash now. It's good practice for me.

Bikeshed, but if possible don't squash down to a single commit, it was more work that that, as well as there were more logical chunks :)

@weaverba137
Copy link
Member Author

I fixed conflicts in CHANGES.rst 3 or 4 times. If you want to squash further, I'll leave that to you.

@bsipocz
Copy link
Member

bsipocz commented Jun 28, 2024

I fixed conflicts in CHANGES.rst 3 or 4 times. If you want to squash further, I'll leave that to you.

I'm not sure what went wrong, but the first commit here, that is the squashed everything, is clearly contains unrelated changes that have been picked up from main.
I'll go ahead and rebase it again, but you can go back and double check what it was when clicking on 10885b2 in the force push log above.

@bsipocz bsipocz merged commit 95c9216 into astropy:main Jun 28, 2024
8 checks passed
@bsipocz
Copy link
Member

bsipocz commented Jun 28, 2024

Thank you @weaverba137!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SDSS DR18 implementation
3 participants