I want to extract and process the metadata from PNG images and the first line of .safetensors files for LLMs and LoRAs. I could spend ages farting around with sed or awk, but the file formats are constantly changing. I’d like a faster way to see a summary of the training and a few other details when they are available.

    • huginn@feddit.it
      18 days ago

      I have a very handy command in my .vimrc for this -

      command! JSON setlocal filetype=json | %!jq .

      Anytime I’m in a JSON file that isn’t formatted, it’s as simple as typing :JSON to have it all sorted.

  • tiredofsametab@kbin.run
    19 days ago

    Previously, I coded something in Rust real quick to spit out and manipulate some JSON, but it looks like the jq/yq below would work fine.

  • Diplomjodler@lemmy.world
    19 days ago

    Python is very good for working with JSON. Definitely will get you there faster than awk for anything not completely trivial.
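
As a rough illustration of the Python route for the .safetensors half of the question, here is a minimal sketch using only the standard library. It assumes the published safetensors layout: an 8-byte little-endian length, followed by that many bytes of UTF-8 JSON, with any free-form training details stored under the optional `__metadata__` key. The function names are made up for this example, and which keys show up in `__metadata__` depends entirely on the tool that wrote the file.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # Next header_len bytes: UTF-8 JSON describing the tensors.
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize(header):
    """Split a header into (metadata dict, number of tensor entries)."""
    # "__metadata__" is optional, so fall back to an empty dict.
    metadata = header.get("__metadata__", {})
    tensor_count = sum(1 for name in header if name != "__metadata__")
    return metadata, tensor_count


# Hypothetical usage:
#   metadata, count = summarize(read_safetensors_header("model.safetensors"))
#   print(count, "tensors")
#   for key, value in sorted(metadata.items()):
#       print(f"{key}: {value}")
```

Only the header is read, so this stays fast even on multi-gigabyte files. The result is a plain dict, which makes it easy to cope with keys that come and go between tool versions.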

  • ᗺark dor@infosec.pub
    19 days ago

    Pipe to jless first to pick out targets, then jq.

    If it’s a small file and I want to make edits, I use yq to convert it to YAML and back again.

    I’m looking at whether DuckDB is a better approach, especially for querying, bulk transforms, and Python.

  • CaptPretentious@lemmy.world
    18 days ago

    Probably not a popular opinion, but pwsh (PowerShell). It’s got a lot of tooling built in, which means I don’t have to learn a different tool just because I’m on a different system.

  • Nibodhika@lemmy.world
    18 days ago

    A week ago I would have said jq, but just the other day I discovered nushell and have been loving it. If you deal with structured data often, it’s way easier. Just bear in mind it’s not POSIX compatible.

  • palordrolap@kbin.run
    18 days ago

    There are probably pre-written awk scripts out there that already do what you want, not that I know where they’d be.

    That said, you might be better off using one of the bigger but still fairly commonly installed languages. There are bound to be things on PyPI (for Python) or CPAN (for Perl) that could be bolted together, for example.

    If you’re really lucky there might even be something that covers your whole use-case, but I haven’t checked.
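
For the PNG half of the question, a standard-library sketch may already be enough before reaching for a package. It assumes the metadata was written as ordinary `tEXt` or `zTXt` chunks (keyword, a NUL byte, then Latin-1 text, per the PNG specification). It skips `iTXt` chunks and does not verify CRCs, and `read_png_text` is a name made up for this example.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text(path):
    """Return {keyword: text} from the tEXt and zTXt chunks of a PNG file."""
    found = {}
    with open(path, "rb") as f:
        if f.read(8) != PNG_SIGNATURE:
            raise ValueError("not a PNG file")
        while True:
            head = f.read(8)
            if len(head) < 8:
                break  # truncated file or no IEND chunk
            # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
            length, chunk_type = struct.unpack(">I4s", head)
            data = f.read(length)
            f.read(4)  # skip the CRC (not verified in this sketch)
            if chunk_type == b"tEXt":
                key, _, text = data.partition(b"\x00")
                found[key.decode("latin-1")] = text.decode("latin-1")
            elif chunk_type == b"zTXt":
                key, _, rest = data.partition(b"\x00")
                # rest[0] is the compression method byte (0 means zlib).
                text = zlib.decompress(rest[1:])
                found[key.decode("latin-1")] = text.decode("latin-1")
            elif chunk_type == b"IEND":
                break
    return found


# Hypothetical usage:
#   for key, value in read_png_text("image.png").items():
#       print(f"{key}: {value}")
```

If the text turns out to be JSON, it can go straight into `json.loads`; otherwise it is whatever free-form string the generating tool chose to write.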