diff --git a/pep-0262.txt b/pep-0262.txt index 20ad391b2f9..ef1933d2fc4 100644 --- a/pep-0262.txt +++ b/pep-0262.txt @@ -5,329 +5,343 @@ Last-Modified: $Date$ Author: A.M. Kuchling Status: Deferred Type: Standards Track +Content-Type: text/x-rst Created: 08-Jul-2001 Post-History: 27-Mar-2002 Introduction +============ - This PEP describes a format for a database of the Python software - installed on a system. +This PEP describes a format for a database of the Python software +installed on a system. - (In this document, the term "distribution" is used to mean a set - of code that's developed and distributed together. A "distribution" - is the same as a Red Hat or Debian package, but the term "package" - already has a meaning in Python terminology, meaning "a directory - with an __init__.py file in it.") +(In this document, the term "distribution" is used to mean a set +of code that's developed and distributed together. A "distribution" +is the same as a Red Hat or Debian package, but the term "package" +already has a meaning in Python terminology, meaning "a directory +with an ``__init__.py`` file in it.") Requirements - - We need a way to figure out what distributions, and what versions of - those distributions, are installed on a system. We want to provide - features similar to CPAN, APT, or RPM. Required use cases that - should be supported are: - - * Is distribution X on a system? - * What version of distribution X is installed? - * Where can the new version of distribution X be found? (This can - be defined as either "a home page where the user can go and - find a download link", or "a place where a program can find - the newest version?" Both should probably be supported.) - * What files did distribution X put on my system? - * What distribution did the file x/y/z.py come from? - * Has anyone modified x/y/z.py locally? - * What other distributions does this software need? - * What Python modules does this distribution provide? +============ + +We need a way to figure out what distributions, and what versions of +those distributions, are installed on a system. We want to provide +features similar to CPAN, APT, or RPM. Required use cases that +should be supported are: + +* Is distribution X on a system? +* What version of distribution X is installed? +* Where can the new version of distribution X be found? (This can + be defined as either "a home page where the user can go and + find a download link", or "a place where a program can find + the newest version?" Both should probably be supported.) +* What files did distribution X put on my system? +* What distribution did the file x/y/z.py come from? +* Has anyone modified x/y/z.py locally? +* What other distributions does this software need? +* What Python modules does this distribution provide? Database Location +================= - The database lives in a bunch of files under - /lib/python/install-db/. This location will be - called INSTALLDB through the remainder of this PEP. +The database lives in a bunch of files under +/lib/python/install-db/. This location will be +called INSTALLDB through the remainder of this PEP. - The structure of the database is deliberately kept simple; each - file in this directory or its subdirectories (if any) describes a - single distribution. Binary packagings of Python software such as - RPMs can then update Python's database by just installing the - corresponding file into the INSTALLDB directory. +The structure of the database is deliberately kept simple; each +file in this directory or its subdirectories (if any) describes a +single distribution. Binary packagings of Python software such as +RPMs can then update Python's database by just installing the +corresponding file into the INSTALLDB directory. - The rationale for scanning subdirectories is that we can move to a - directory-based indexing scheme if the database directory contains - too many entries. For example, this would let us transparently - switch from INSTALLDB/Numeric to INSTALLDB/N/Nu/Numeric or some - similar hashing scheme. +The rationale for scanning subdirectories is that we can move to a +directory-based indexing scheme if the database directory contains +too many entries. For example, this would let us transparently +switch from INSTALLDB/Numeric to INSTALLDB/N/Nu/Numeric or some +similar hashing scheme. Database Contents +================= - Each file in INSTALLDB or its subdirectories describes a single - distribution, and has the following contents: +Each file in INSTALLDB or its subdirectories describes a single +distribution, and has the following contents: - An initial line listing the sections in this file, separated - by whitespace. Currently this will always be 'PKG-INFO FILES - REQUIRES PROVIDES'. This is for future-proofing; if we add a - new section, for example to list documentation files, then - we'd add a DOCS section and list it in the contents. Sections - are always separated by blank lines. +An initial line listing the sections in this file, separated +by whitespace. Currently this will always be 'PKG-INFO FILES +REQUIRES PROVIDES'. This is for future-proofing; if we add a +new section, for example to list documentation files, then +we'd add a DOCS section and list it in the contents. Sections +are always separated by blank lines. - A distribution that uses the Distutils for installation should - automatically update the database. Distributions that roll their - own installation will have to use the database's API to - manually add or update their own entry. System package managers - such as RPM or pkgadd can just create the new file in the - INSTALLDB directory. +A distribution that uses the Distutils for installation should +automatically update the database. Distributions that roll their +own installation will have to use the database's API to +manually add or update their own entry. System package managers +such as RPM or pkgadd can just create the new file in the +INSTALLDB directory. - Each section of the file is used for a different purpose. +Each section of the file is used for a different purpose. - PKG-INFO section +PKG-INFO section +---------------- - An initial set of RFC-822 headers containing the distribution - information for a file, as described in PEP 241, "Metadata for - Python Software Packages". +An initial set of RFC-822 headers containing the distribution +information for a file, as described in PEP 241, "Metadata for +Python Software Packages". - FILES section +FILES section +------------- - An entry for each file installed by the - distribution. Generated files such as .pyc and .pyo files are - on this list as well as the original .py files installed by a - distribution; their checksums won't be stored or checked, - though. +An entry for each file installed by the +distribution. Generated files such as .pyc and .pyo files are +on this list as well as the original .py files installed by a +distribution; their checksums won't be stored or checked, +though. - Each file's entry is a single tab-delimited line that contains - the following fields: +Each file's entry is a single tab-delimited line that contains +the following fields: - * The file's full path, as installed on the system. +* The file's full path, as installed on the system. - * The file's size +* The file's size - * The file's permissions. On Windows, this field will always be - 'unknown' +* The file's permissions. On Windows, this field will always be + 'unknown' - * The owner and group of the file, separated by a tab. - On Windows, these fields will both be 'unknown'. +* The owner and group of the file, separated by a tab. + On Windows, these fields will both be 'unknown'. - * A SHA1 digest of the file, encoded in hex. For generated files - such as *.pyc files, this field must contain the string "-", - which indicates that the file's checksum should not be verified. +* A SHA1 digest of the file, encoded in hex. For generated files + such as \*.pyc files, this field must contain the string "-", + which indicates that the file's checksum should not be verified. - REQUIRES section +REQUIRES section +---------------- - This section is a list of strings giving the services required for - this module distribution to run properly. This list includes the - distribution name ("python-stdlib") and module names ("rfc822", - "htmllib", "email", "email.Charset"). It will be specified - by an extra 'requires' argument to the distutils.core.setup() - function. For example: +This section is a list of strings giving the services required for +this module distribution to run properly. This list includes the +distribution name ("python-stdlib") and module names ("rfc822", +"htmllib", "email", "email.Charset"). It will be specified +by an extra 'requires' argument to the ``distutils.core.setup()`` +function. For example:: - setup(..., requires=['xml.utils.iso8601', + setup(..., requires=['xml.utils.iso8601', - Eventually there may be automated tools that look through all of - the code and produce a list of requirements, but it's unlikely - that these tools can handle all possible cases; a manual - way to specify requirements will always be necessary. +Eventually there may be automated tools that look through all of +the code and produce a list of requirements, but it's unlikely +that these tools can handle all possible cases; a manual +way to specify requirements will always be necessary. - PROVIDES section +PROVIDES section +---------------- - This section is a list of strings giving the services provided by - an installed distribution. This list includes the distribution name - ("python-stdlib") and module names ("rfc822", "htmllib", "email", - "email.Charset"). +This section is a list of strings giving the services provided by +an installed distribution. This list includes the distribution name +("python-stdlib") and module names ("rfc822", "htmllib", "email", +"email.Charset"). - XXX should files be listed? e.g. $PREFIX/lib/color-table.txt, - to pick up data files, required scripts, etc. +XXX should files be listed? e.g. $PREFIX/lib/color-table.txt, +to pick up data files, required scripts, etc. - Eventually there may be an option to let module developers add - their own strings to this section. For example, you might add - "XML parser" to this section, and other module distributions could - then list "XML parser" as one of their dependencies to indicate - that multiple different XML parsers can be used. For now this - ability isn't supported because it raises too many issues: do we - need a central registry of legal strings, or just let people put - whatever they like? Etc., etc... +Eventually there may be an option to let module developers add +their own strings to this section. For example, you might add +"XML parser" to this section, and other module distributions could +then list "XML parser" as one of their dependencies to indicate +that multiple different XML parsers can be used. For now this +ability isn't supported because it raises too many issues: do we +need a central registry of legal strings, or just let people put +whatever they like? Etc., etc... API Description - - There's a single fundamental class, InstallationDatabase. The - code for it lives in distutils/install_db.py. (XXX any - suggestions for alternate locations in the standard library, or an - alternate module name?) - - The InstallationDatabase returns instances of Distribution that contain - all the information about an installed distribution. - - XXX Several of the fields in Distribution are duplicates of ones in - distutils.dist.Distribution. Probably they should be factored out - into the Distribution class proposed here, but can this be done in a - backward-compatible way? - - InstallationDatabase has the following interface: - -class InstallationDatabase: - def __init__ (self, path=None): - """InstallationDatabase(path:string) - Read the installation database rooted at the specified path. - If path is None, INSTALLDB is used as the default. +=============== + +There's a single fundamental class, InstallationDatabase. The +code for it lives in distutils/install_db.py. (XXX any +suggestions for alternate locations in the standard library, or an +alternate module name?) + +The InstallationDatabase returns instances of Distribution that contain +all the information about an installed distribution. + +XXX Several of the fields in Distribution are duplicates of ones in +distutils.dist.Distribution. Probably they should be factored out +into the Distribution class proposed here, but can this be done in a +backward-compatible way? + +InstallationDatabase has the following interface:: + + class InstallationDatabase: + def __init__ (self, path=None): + """InstallationDatabase(path:string) + Read the installation database rooted at the specified path. + If path is None, INSTALLDB is used as the default. + """ + + def get_distribution (self, distribution_name): + """get_distribution(distribution_name:string) : Distribution + Get the object corresponding to a single distribution. + """ + + def list_distributions (self): + """list_distributions() : [Distribution] + Return a list of all distributions installed on the system, + enumerated in no particular order. + """ + + def find_distribution (self, path): + """find_file(path:string) : Distribution + Search and return the distribution containing the file 'path'. + Returns None if the file doesn't belong to any distribution + that the InstallationDatabase knows about. + XXX should this work for directories? + """ + + class Distribution: + """Instance attributes: + name : string + Distribution name + files : {string : (size:int, perms:int, owner:string, group:string, + digest:string)} + Dictionary mapping the path of a file installed by this distribution + to information about the file. + + The following fields all come from PEP 241. + + version : distutils.version.Version + Version of this distribution + platform : [string] + summary : string + description : string + keywords : string + home_page : string + author : string + author_email : string + license : string """ - def get_distribution (self, distribution_name): - """get_distribution(distribution_name:string) : Distribution - Get the object corresponding to a single distribution. - """ - - def list_distributions (self): - """list_distributions() : [Distribution] - Return a list of all distributions installed on the system, - enumerated in no particular order. - """ - - def find_distribution (self, path): - """find_file(path:string) : Distribution - Search and return the distribution containing the file 'path'. - Returns None if the file doesn't belong to any distribution - that the InstallationDatabase knows about. - XXX should this work for directories? - """ - -class Distribution: - """Instance attributes: - name : string - Distribution name - files : {string : (size:int, perms:int, owner:string, group:string, - digest:string)} - Dictionary mapping the path of a file installed by this distribution - to information about the file. - - The following fields all come from PEP 241. - - version : distutils.version.Version - Version of this distribution - platform : [string] - summary : string - description : string - keywords : string - home_page : string - author : string - author_email : string - license : string - """ - - def add_file (self, path): - """add_file(path:string):None - Record the size, ownership, &c., information for an installed file. - XXX as written, this would stat() the file. Should the size/perms/ - checksum all be provided as parameters to this method instead? - """ - - def has_file (self, path): - """has_file(path:string) : Boolean - Returns true if the specified path belongs to a file in this - distribution. - """ - - def check_file (self, path): - """check_file(path:string) : Boolean - Checks whether the file's size, checksum, and ownership match, - returning true if they do. - """ - - + def add_file (self, path): + """add_file(path:string):None + Record the size, ownership, &c., information for an installed file. + XXX as written, this would stat() the file. Should the size/perms/ + checksum all be provided as parameters to this method instead? + """ + + def has_file (self, path): + """has_file(path:string) : Boolean + Returns true if the specified path belongs to a file in this + distribution. + """ + + def check_file (self, path): + """check_file(path:string) : Boolean + Checks whether the file's size, checksum, and ownership match, + returning true if they do. + """ Deliverables +============ - A description of the database API, to be added to this PEP. +A description of the database API, to be added to this PEP. - Patches to the Distutils that 1) implement an InstallationDatabase - class, 2) Update the database when a new distribution is installed. 3) - add a simple package management tool, features to be added to this - PEP. (Or should that be a separate PEP?) See [2] for the current - patch. +Patches to the Distutils that 1) implement an InstallationDatabase +class, 2) Update the database when a new distribution is installed. 3) +add a simple package management tool, features to be added to this +PEP. (Or should that be a separate PEP?) See [2]_ for the current +patch. Open Issues +=========== - PJE suggests the installation database "be potentially present on - every directory in sys.path, with the contents merged in sys.path - order. This would allow home-directory or other - alternate-location installs to work, and ease the process of a - distutils install command writing the file." Nice feature: it does - mean that package manager tools can take into account Python - packages that a user has privately installed. +PJE suggests the installation database "be potentially present on +every directory in sys.path, with the contents merged in sys.path +order. This would allow home-directory or other +alternate-location installs to work, and ease the process of a +distutils install command writing the file." Nice feature: it does +mean that package manager tools can take into account Python +packages that a user has privately installed. - AMK wonders: what does setup.py do if it's told to install - packages to a directory not on sys.path? Does it write an - install-db directory to the directory it's told to write to, or - does it do nothing? +AMK wonders: what does setup.py do if it's told to install +packages to a directory not on sys.path? Does it write an +install-db directory to the directory it's told to write to, or +does it do nothing? - Should the package-database file itself be included in the files - list? (PJE would think yes, but of course it can't contain its - own checksum. AMK can't think of a use case where including the - DB file matters.) +Should the package-database file itself be included in the files +list? (PJE would think yes, but of course it can't contain its +own checksum. AMK can't think of a use case where including the +DB file matters.) - PJE wonders about writing the package DB file - *first*, before installing any other files, so that failed partial - installations can both be backed out, and recognized as broken. - This PEP may have to specify some algorithm for recognizing this - situation. +PJE wonders about writing the package DB file +**first**, before installing any other files, so that failed partial +installations can both be backed out, and recognized as broken. +This PEP may have to specify some algorithm for recognizing this +situation. - Should we guarantee the format of installation databases remains - compatible across Python versions, or is it subject to arbitrary - change? Probably we need to guarantee compatibility. +Should we guarantee the format of installation databases remains +compatible across Python versions, or is it subject to arbitrary +change? Probably we need to guarantee compatibility. Rejected Suggestions - - Instead of using one text file per distribution, one large text - file or an anydbm file could be used. This has been rejected for - a few reasons. First, performance is probably not an extremely - pressing concern as the database is only used when installing or - removing software, a relatively infrequent task. Scalability also - likely isn't a problem, as people may have hundreds of Python - packages installed, but thousands or tens of thousands seems - unlikely. Finally, individual text files are compatible with - installers such as RPM or DPKG because a binary packager can just - drop the new database file into the database directory. If one - large text file or a binary file were used, the Python database - would then have to be updated by running a postinstall script. - - On Windows, the permissions and owner/group of a file aren't - stored. Windows does in fact support ownership and access - permissions, but reading and setting them requires the win32all - extensions, and they aren't present in the basic Python installer - for Windows. +==================== + +Instead of using one text file per distribution, one large text +file or an anydbm file could be used. This has been rejected for +a few reasons. First, performance is probably not an extremely +pressing concern as the database is only used when installing or +removing software, a relatively infrequent task. Scalability also +likely isn't a problem, as people may have hundreds of Python +packages installed, but thousands or tens of thousands seems +unlikely. Finally, individual text files are compatible with +installers such as RPM or DPKG because a binary packager can just +drop the new database file into the database directory. If one +large text file or a binary file were used, the Python database +would then have to be updated by running a postinstall script. + +On Windows, the permissions and owner/group of a file aren't +stored. Windows does in fact support ownership and access +permissions, but reading and setting them requires the win32all +extensions, and they aren't present in the basic Python installer +for Windows. References +========== - [1] Michael Muller's patch (posted to the Distutils-SIG around 28 - Dec 1999) generates a list of installed files. +.. [1] Michael Muller's patch (posted to the Distutils-SIG around 28 + Dec 1999) generates a list of installed files. - [2] A patch to implement this PEP will be tracked as - patch #562100 on SourceForge. - http://www.python.org/sf/562100 . - Code implementing the installation database is currently in - Python CVS in the nondist/sandbox/pep262 directory. +.. [2] A patch to implement this PEP will be tracked as + patch #562100 on SourceForge. + http://www.python.org/sf/562100 . + Code implementing the installation database is currently in + Python CVS in the nondist/sandbox/pep262 directory. Acknowledgements +================ - Ideas for this PEP originally came from postings by Greg Ward, - Fred L. Drake Jr., Thomas Heller, Mats Wichmann, Phillip J. Eby, - and others. +Ideas for this PEP originally came from postings by Greg Ward, +Fred L. Drake Jr., Thomas Heller, Mats Wichmann, Phillip J. Eby, +and others. - Many changes and rewrites to this document were suggested by the - readers of the Distutils SIG. +Many changes and rewrites to this document were suggested by the +readers of the Distutils SIG. Copyright +========= - This document has been placed in the public domain. +This document has been placed in the public domain. - -Local Variables: -mode: indented-text -indent-tabs-mode: nil -End: +.. + Local Variables: + mode: indented-text + indent-tabs-mode: nil + End: