Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Work-around for hardlinks unpacking #5

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ony
Copy link
Contributor

@ony ony commented Dec 1, 2019

This fixes #4 (at least those that are exposed in test).

We should monitor libarchive/libarchive#1203 for a better solution than temporary amending archive entries.

P.S. Commit is done on top PR #2 to show results.

@ony ony force-pushed the fix/hardlinks-unpack branch from 367919e to a296301 Compare December 1, 2019 15:58
@vmchale
Copy link
Owner

vmchale commented Dec 2, 2019

This should be fixed by a small rewrite now pushed to master. In particular, tests now pass.

Thanks for getting to the bottom of this!

@ony
Copy link
Contributor Author

ony commented Dec 2, 2019

@vmchale, thank you for considering this PR. I actually have concerns about code like this:

preHardlink <- liftIO $ archiveEntryHardlink x  -- pointer to current hardlink
hardlink <- liftIO $ peekCString preHardlink
let hardlink' = fp </> hardlink
liftIO $ withCString hardlink' $ \hl ->
    archiveEntrySetHardlink x hl  -- not sure if preHardLink is still valid after this
ignore $ archiveReadExtract a x archiveExtractTime
liftIO $ archiveEntrySetPathname x preFile
liftIO $ archiveEntrySetHardlink x preHardlink

I thought with a296301 I made this code a bit safer and hack itself more isolated.

@ony
Copy link
Contributor Author

ony commented Dec 2, 2019

Valgrind proof for d2b34ac :

zsh% valgrind dist/build/libarchive-test/libarchive-test 
...
==11615== Invalid read of size 1
==11615==    at 0x4A243B8: archive_strncat (in /usr/x86_64-pc-linux-gnu/lib/libarchive.so.13.4.0)
==11615==    by 0x4A24F4C: archive_mstring_copy_mbs_len (in /usr/x86_64-pc-linux-gnu/lib/libarchive.so.13.4.0)
==11615==    by 0x49D9F37: archive_entry_set_hardlink (in /usr/x86_64-pc-linux-gnu/lib/libarchive.so.13.4.0)
==11615==    by 0x42F2E3: libarchivezm2zi1zi0zi0zmABccD2SB0Y6If1yoANPt6ZZ_CodecziArchiveziUnpack_zdwunpackEntriesFp_info (in /tmp/ws/libarchive/dist/build/libarchive-test/libarchive-test)
==11615==  Address 0x936e8f1 is 1 bytes inside a block of size 32 free'd
==11615==    at 0x4838A4B: realloc (in /usr/x86_64-pc-linux-gnu/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11615==    by 0x4A21636: archive_string_ensure (in /usr/x86_64-pc-linux-gnu/lib/libarchive.so.13.4.0)
==11615==    by 0x4A216AA: ??? (in /usr/x86_64-pc-linux-gnu/lib/libarchive.so.13.4.0)
==11615==    by 0x4A243CD: archive_strncat (in /usr/x86_64-pc-linux-gnu/lib/libarchive.so.13.4.0)
==11615==    by 0x4A24F4C: archive_mstring_copy_mbs_len (in /usr/x86_64-pc-linux-gnu/lib/libarchive.so.13.4.0)
==11615==    by 0x49D9F37: archive_entry_set_hardlink (in /usr/x86_64-pc-linux-gnu/lib/libarchive.so.13.4.0)
==11615==    by 0x42E783: ??? (in /tmp/ws/libarchive/dist/build/libarchive-test/libarchive-test)
==11615==  Block was alloc'd at
==11615==    at 0x483637F: malloc (in /usr/x86_64-pc-linux-gnu/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11615==    by 0x4838AB7: realloc (in /usr/x86_64-pc-linux-gnu/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11615==    by 0x4A21636: archive_string_ensure (in /usr/x86_64-pc-linux-gnu/lib/libarchive.so.13.4.0)
==11615==    by 0x4A216AA: ??? (in /usr/x86_64-pc-linux-gnu/lib/libarchive.so.13.4.0)
==11615==    by 0x4A24C02: archive_strncat_l (in /usr/x86_64-pc-linux-gnu/lib/libarchive.so.13.4.0)
==11615==    by 0x4A250E5: archive_mstring_copy_mbs_len_l (in /usr/x86_64-pc-linux-gnu/lib/libarchive.so.13.4.0)
==11615==    by 0x49DA0A7: _archive_entry_copy_hardlink_l (in /usr/x86_64-pc-linux-gnu/lib/libarchive.so.13.4.0)
==11615==    by 0x4A12B5E: ??? (in /usr/x86_64-pc-linux-gnu/lib/libarchive.so.13.4.0)
==11615==    by 0x4A135F8: ??? (in /usr/x86_64-pc-linux-gnu/lib/libarchive.so.13.4.0)
==11615==    by 0x4A13ED4: ??? (in /usr/x86_64-pc-linux-gnu/lib/libarchive.so.13.4.0)
==11615==    by 0x49E6BE8: ??? (in /usr/x86_64-pc-linux-gnu/lib/libarchive.so.13.4.0)
==11615==    by 0x49E6CEE: ??? (in /usr/x86_64-pc-linux-gnu/lib/libarchive.so.13.4.0)
...

And for 97f80b1 :

/tmp/ws/libarchive
zsh% valgrind dist/build/libarchive-test/libarchive-test   
==12124== Memcheck, a memory error detector
==12124== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==12124== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==12124== Command: dist/build/libarchive-test/libarchive-test
==12124== 
==12124== Warning: set address range perms: large range [0x4200000000, 0x14200100000) (noaccess)

roundtrip
  with symlinks
    packs/unpacks successfully without loss
    packs/unpacks on filesystem successfully without loss
  with hardlinks
    packs/unpacks successfully without loss
    packs/unpacks on filesystem successfully without loss
  with forward referenced hardlinks
    packs/unpacks successfully without loss
    re-ordering on unpack
      packs/unpacks on filesystem successfully without loss
        # PENDING: No reason given
  having entry without ownership
    packs/unpacks successfully without loss
      # PENDING: No reason given
  having entry without timestamp
    packs/unpacks successfully without loss
      # PENDING: No reason given

Finished in 0.4359 seconds
8 examples, 0 failures, 3 pending
==12124== 
==12124== HEAP SUMMARY:
==12124==     in use at exit: 112 bytes in 1 blocks
==12124==   total heap usage: 3,655 allocs, 3,654 frees, 2,348,779 bytes allocated
==12124== 
==12124== LEAK SUMMARY:
==12124==    definitely lost: 112 bytes in 1 blocks
==12124==    indirectly lost: 0 bytes in 0 blocks
==12124==      possibly lost: 0 bytes in 0 blocks
==12124==    still reachable: 0 bytes in 0 blocks
==12124==         suppressed: 0 bytes in 0 blocks
==12124== Rerun with --leak-check=full to see details of leaked memory
==12124== 
==12124== For lists of detected and suppressed errors, rerun with: -s
==12124== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
(pts/23/rxvt-unicode-256color) nikolay@ony [0]  2019-12-02 21:12:53

Please re-consider.

Note that restoring entries might not be very useful. We are not re-using them, I guess. And if we would then we need to ensure that it doesn't happen in separate threads. Ehhh.. Unfortunately Haskell doesn't have destructible input like in Mercury language.

This also fixes access to freed memory. See comments in vmchale#4
@ony ony force-pushed the fix/hardlinks-unpack branch from a296301 to c785a0b Compare December 5, 2019 06:07
@ony
Copy link
Contributor Author

ony commented Dec 5, 2019

@vmchale , I re-based my change that includes memory fix and adapted to style you were using in 7d8d639 (including removal of LambdaCase). Please let me know if you have concerns about it ( c785a0b ).

Thank you.

-- See https://github.com/libarchive/libarchive/issues/1203
withPrefix :: FilePath -> ArchiveEntryPtr -> IO a -> IO a
withPrefix fp x inner = do
file0 <- peekCString =<< archiveEntryPathname x
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this mean the pointer to the original archive entry pathname is forgotten? Thus, the original pathname will never be freed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Declaration is const char * archive_entry_pathname(struct archive_entry *a). Thus ownership for returned pointer is not transferred to caller as in char *strdup(const char*). It still owned by archive_entry structure.

You can find similar code in readEntry:

Entry
        <$> (peekCString =<< archiveEntryPathname entry)
        <*> readContents a entry
        -- ...

Copy link
Owner

@vmchale vmchale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test suite fails with

Test suite libarchive-test: RUNNING...

roundtrip
  sucessfully unpacks/packs (test/data/libarchive-1.0.5.1.tar.gz) FAILED [1]

Failures:

  test/Spec.hs:26:23:
  1) roundtrip sucessfully unpacks/packs (test/data/libarchive-1.0.5.1.tar.gz)
       predicate failed on: Left ArchiveFatal

  To rerun use: --match "/roundtrip/sucessfully unpacks/packs (test/data/libarchive-1.0.5.1.tar.gz)/"

Randomized with seed 1103473752

Finished in 0.0004 seconds
1 example, 1 failure

on my computer.

You need to run make -j so that the full test suite runs :)

@ony
Copy link
Contributor Author

ony commented Dec 5, 2019

The test suite fails with

Test suite libarchive-test: RUNNING...

roundtrip
  sucessfully unpacks/packs (test/data/libarchive-1.0.5.1.tar.gz) FAILED [1]

Failures:

  test/Spec.hs:26:23:
  1) roundtrip sucessfully unpacks/packs (test/data/libarchive-1.0.5.1.tar.gz)
       predicate failed on: Left ArchiveFatal

  To rerun use: --match "/roundtrip/sucessfully unpacks/packs (test/data/libarchive-1.0.5.1.tar.gz)/"

Randomized with seed 1103473752

Finished in 0.0004 seconds
1 example, 1 failure

on my computer.

You need to run make -j so that the full test suite runs :)

The same happens on f3dba95 (current master) on my computer :)
Wait maybe I didn't re-built source properly. Looking into it.

@ony
Copy link
Contributor Author

ony commented Dec 5, 2019

@vmchale , I suspect you have corrupted tar files. Please check that output of:

/tmp/ws/libarchive
zsh% sha1sum test/data/*
065a63ddf8523abdbf42066af42251ec876e0ed7  test/data/ATS2-Postiats-0.3.13.tar
4aec46a857f4f681022d8a133e058a9d29c7b3ae  test/data/alsa-lib-1.1.9.tar
aa30577a1eef88f473dd5441b311d47bd3e57088  test/data/ghc-8.8.1-src.tar
911b0263895072ce042948fc5c183286b7b6a8cf  test/data/libarchive-1.0.5.1.tar
aab843981927fd34827174479e333117ea3b0569  test/data/llvm-9.0.0.src.tar

On c785a0b error doesn't appear after executing:

zsh% rm -rf dist*; cabal v1-configure --enable-tests && cabal v1-build --verbose=0 && cabal v1-test && cat dist/test/libarchive-2.1.0.1-libarchive-test.log 
Resolving dependencies...
[1 of 1] Compiling Main             ( dist/setup/setup.hs, dist/setup/Main.o )
Linking ./dist/setup/setup ...
Configuring libarchive-2.1.0.1...

src/Codec/Archive/Foreign/Archive.chs:333:1: warning: [-Wunused-imports]
    The qualified import of ‘Codec.Archive.Types.Foreign’ is redundant
      except perhaps to import instances from ‘Codec.Archive.Types.Foreign’
    To import instances alone, use: import Codec.Archive.Types.Foreign()
    |
333 | import Foreign.C.String
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

src/Codec/Archive/Pack.hs:26:1: warning: [-Wunused-imports]
    The import of ‘CLLong’ from module ‘Foreign.C.Types’ is redundant
   |
26 | import           Foreign.C.Types           (CLLong (..), CLong (..))
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Preprocessing library for libarchive-2.1.0.1..
Building library for libarchive-2.1.0.1..
Preprocessing test suite 'libarchive-test' for libarchive-2.1.0.1..
Building test suite 'libarchive-test' for libarchive-2.1.0.1..
Running 1 test suites...
Test suite libarchive-test: RUNNING...
Test suite libarchive-test: PASS
Test suite logged to: dist/test/libarchive-2.1.0.1-libarchive-test.log
1 of 1 test suites (1 of 1 test cases) passed.
Test suite libarchive-test: RUNNING...

roundtrip
  sucessfully unpacks/packs (test/data/ATS2-Postiats-0.3.13.tar)
  sucessfully unpacks/packs (test/data/llvm-9.0.0.src.tar)
  sucessfully unpacks/packs (test/data/ghc-8.8.1-src.tar)
  sucessfully unpacks/packs (test/data/alsa-lib-1.1.9.tar)
  sucessfully unpacks/packs (test/data/libarchive-1.0.5.1.tar)
  with symlinks
    packs/unpacks successfully without loss
    packs/unpacks on filesystem successfully without loss
  with hardlinks
    packs/unpacks successfully without loss
    issue#4
      packs/unpacks on filesystem successfully without loss
  with forward referenced hardlinks
    packs/unpacks successfully without loss
    re-ordering on unpack
      packs/unpacks on filesystem successfully without loss
        # PENDING: No reason given
  having entry without ownership
    packs/unpacks successfully without loss
      # PENDING: No reason given
  having entry without timestamp
    packs/unpacks successfully without loss
      # PENDING: No reason given

Finished in 3.1821 seconds
13 examples, 0 failures, 3 pending
Test suite libarchive-test: PASS
Test suite logged to: dist/test/libarchive-2.1.0.1-libarchive-test.log

P.S. For some reason my make is going crazy and always re-downloads files. I decided to run it once and without -j (parallel jobs).

@ony
Copy link
Contributor Author

ony commented Dec 5, 2019

Heh. Found it:

...
  1) roundtrip sucessfully unpacks/packs (test/data/libarchive-1.0.5.1.tar.gz)
...

Culprit is test/data/libarchive-1.0.5.1.tar.gz: notice .tar.gz suffix. This is compressed file.

@vmchale
Copy link
Owner

vmchale commented Dec 6, 2019

I'm getting the same valgrind results for 43c6ff4 and c785a0b?

In any case, I've pushed a simplified version.

@ony
Copy link
Contributor Author

ony commented Dec 6, 2019

@vmchale, thank you for fixing this issue. I confirm that now there is no reports of invalid memory access on my computer. I believe it is fine to close this PR.

I've noticed that you introduced CI. That's a very useful for these situations when people are having different results. Since this library interacts with C directly it might be useful to add valgrind check. After checking out github search, I think .github/workflows/haskell.yml can be extended with something like:

diff --git a/.github/workflows/haskell.yml b/.github/workflows/haskell.yml
index 69982f2..cfb5683 100644
--- a/.github/workflows/haskell.yml
+++ b/.github/workflows/haskell.yml
@@ -7,6 +7,8 @@ jobs:
         with:
           cabal-version: "3.0"
           ghc-version: "8.8.1"
+      - name: "Install valgrind"
+        run: "sudo apt-get install -y valgrind"
       - name: "Install libarchive"
         run: |
           wget https://www.libarchive.org/downloads/libarchive-3.4.0.tar.gz
@@ -27,8 +29,11 @@ jobs:
           make -j
       - name: Tests
         run: "cabal test"
+      - name: "Valgrind tests"
+        run: "valgrind dist/build/libarchive-test/libarchive-test"
       - name: Documentation
         run: "cabal haddock"
 name: "Haskell CI"
 on:
   - push
+  - pull_request

It seems that valgrind cabal v1-test doesn't expose issue since valgrind injects its library only in cabal and not into sub-processes run by cabal. Thus it is required to call it directly with tests binary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

unpackArchive/unpackToDirLazy uses current directory for referencing hardlinks
2 participants