Improve command classification #3

Closed
opened 2022-04-06 18:06:54 +02:00 by Lephenixnoir · 6 comments
Owner

Currently commands are named semi-randomly, and I'm already feeling a lack of logic in the naming.

My plan is to re-classify somewhere along these lines:

  • a: Analysis functions
    Analysis means it performs some non-trivial computational task, probably on large amounts of code/data, and it saves the results in the virtual space/disassembly. Like the big analysis step when you first open a file in IDA for example.

    • ad: Disassemble functions named by hand
    • ads: Disassemble all syscalls
    • Other variants with other entry points, including bootcode, interrupt handlers and applications (why not)
  • d: Disassembly
    Shows disassembled code. Currently it uses a temporary disassembly rather than the main one which has all the analysis results, which should probably change by default once the analysis is rich enough.

  • f: Functions
    Commands that compute/show/manipulate information about functions.

    • fcg: Call graph
    • fx: Cross-references
    • Give signatures, calling conventions, abstract-interpretation-derived info...
  • i: Info functions
    Commands that output information from fxos's data structures that was previously computed, with no new analysis, just sorting, filtering, etc.

    • io: Info OS
    • is: Probably info symbols, with filtering (instead of syscalls). Currently sl
    • isc: Info syscalls
    • Then info on functions, claims, datatypes, etc.
  • m: Metadata/annotation functions
    Commands to set small bits of information about particular addresses or syscalls. Things like: defining symbols, assigning data types to values in memory. Would include the current sa and ss (probably unified).

  • p: Project functions
    For the hypothetical future time when I'll want to save analysis results on disk and reload them later.

  • s: Search functions

    • s4: Search 4-aligned value (currently af4)
    • sh: Search hexadecimal pattern (currently afh)
    • ss: Search in strings
    • ...
  • v: Virtual space functions
    Like currently, except that vct is kind of weird. I might remove it.
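
As a rough illustration of the "first letter is the category" idea, here is a minimal Python sketch. The category letters and the three renames (sl, af4, afh) come from the plan above; the dictionaries and the function names `category_of` and `migrate` are hypothetical and are not part of fxos.

```python
# Hypothetical sketch: the letters and renames are taken from the plan,
# but this data layout and these function names are illustrative only.
CATEGORIES = {
    "a": "analysis",
    "d": "disassembly",
    "f": "functions",
    "i": "info",
    "m": "metadata",
    "p": "project",
    "s": "search",
    "v": "virtual space",
}

# Renames mentioned in the plan (current name -> proposed name).
RENAMES = {"sl": "is", "af4": "s4", "afh": "sh"}


def category_of(command: str) -> str:
    """Return the category implied by the first letter of a command name."""
    return CATEGORIES.get(command[:1], "unclassified")


def migrate(command: str) -> str:
    """Translate a current command name to its proposed name, if it has one."""
    return RENAMES.get(command, command)
```

With this layout, `category_of(migrate("afh"))` lands in the search category, while g, e and h stay unclassified until they get a home.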

Not currently classified: g (currently useless), e, ev, h (probably worth generalizing into more advanced visualisations).

Still missing: cross-references for data (eg. registers or RAM addresses), everything related to comparisons of different OS versions.

I don't think I'll ever implement all of this, which means it's future-proof enough. x)

@Dr-Carlos Input welcome.

Collaborator

I like the plan - it's good to have a long-term vision even if the commands don't all get implemented.

You might also want a 'misc' category - for g, e, ev, h (until they are moved into something else), and possibly even the . command.

I agree that vct is a bit weird - you could make it a part of vc, but I think it makes most sense to just do this with the . command.

The longer you wait to change things the more people will be affected, so you probably want to implement this sooner rather than later.

Collaborator

Also, dr and dtl. I assume dr stays the same, but I think dtl makes more sense in virtual space, as there is nothing actually being disassembled.

Author
Owner

Thanks for your feedback!

> You might also want a 'misc' category - for g, e, ev, h (until they are moved into something else), and possibly even the . command.

Since the letter is the category I guess there's implicitly a category for each. :P

For ev I think I can put an option in e instead, something like e vspace=fx_2.05 Bdisp_PutDispDD. This seems clearer to me, especially since these commands have variadic arguments.
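
To make the option idea concrete, here is a minimal Python sketch of splitting leading key=value words from the variadic arguments that follow. The function name `parse_args` and the rule that only leading words count as options are assumptions for illustration, not how fxos parses commands.

```python
def parse_args(words):
    """Split command words into (options, positionals).

    Assumption: only words of the form key=value that appear before the
    first positional argument are treated as options.
    """
    options = {}
    positionals = []
    for word in words:
        key, sep, value = word.partition("=")
        if sep and key.isidentifier() and not positionals:
            options[key] = value
        else:
            positionals.append(word)
    return options, positionals
```

For example, `parse_args("vspace=fx_2.05 Bdisp_PutDispDD".split())` separates the vspace option from the single positional argument, and any number of further arguments can follow.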

> I agree that vct is a bit weird - you could make it a part of vc, but I think it makes most sense to just do this with the . command.

The original intent was that I didn't like the fact that vm etc. relied on the current virtual space, which is global state and not visible when reading the script. But I guess it's not the worst; relying on the particular value of $ would be more questionable by comparison.

I think I'll just get rid of it and use vc and the . command, which are much more natural.

> The longer you wait to change things the more people will be affected, so you probably want to implement this sooner rather than later.

I don't think there'll ever be "many" users, but yes.

> Also, dr and dtl. I assume dr stays the same, but I think dtl makes more sense in virtual space, as there is nothing actually being disassembled.

For dr I want to try to improve the parser so that d can accept both an address and a range, as the distinction is not functional but purely syntactic.
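
A minimal Python sketch of what "accept both an address and a range" could look like. The `START..END` range syntax, the `Region` type and the `parse_location` name are all hypothetical; the thread does not specify the syntax fxos uses.

```python
from typing import NamedTuple


class Region(NamedTuple):
    start: int
    end: int  # exclusive


def parse_location(text: str, default_size: int = 1) -> Region:
    """Parse 'ADDR' or 'START..END' (hypothetical syntax) into a Region.

    Addresses are hexadecimal, with or without a 0x prefix. A single
    address becomes a region of default_size bytes.
    """
    if ".." in text:
        start_text, end_text = text.split("..", 1)
        start, end = int(start_text, 16), int(end_text, 16)
        if end < start:
            raise ValueError(f"empty or reversed range: {text}")
        return Region(start, end)
    start = int(text, 16)
    return Region(start, start + default_size)
```

Because both forms produce the same `Region` value, a command built on it would not need separate address and range variants.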

For dtl, it doesn't belong to v either because the disassembly table is global, i.e. it's not linked to the virtual space. Since this loads a file into fxos globally, I'm thinking maybe .dt would be a better pick.

Collaborator

Yep, they're all good points.

> For dr I want to try to improve the parser so that d can accept both an address and a range, as the distinction is not functional but purely syntactic.
>
> For dtl, it doesn't belong to v either because the disassembly table is global, i.e. it's not linked to the virtual space. Since this loads a file into fxos globally, I'm thinking maybe .dt would be a better pick.

Having the parser accept an address or a range would be good, not only for dr, but for ad and other commands which currently accept only single addresses.

It might be possible that future calculators would want completely different disassembly tables, so it would make more sense as v - but given the current scope of the project, I agree that .dt makes more sense.

Author
Owner

> Having the parser accept an address or a range would be good, not only for dr, but for ad and other commands which currently accept only single addresses.

Yes. For ad specifically I'm not sure a region would make sense, since ad explores functions starting at the specified addresses - it can't automatically find functions in a range and can only start at an entry point. But for instance h could do with single addresses sometimes, ic could use ranges very effectively, etc.
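
To illustrate why an entry point matters, here is a toy Python sketch of exploration from entry points. It assumes exploration means following calls with a worklist, which the thread does not spell out; the `explore` function and the `callees` map (a stand-in for real disassembly) are hypothetical.

```python
def explore(entry_points, callees):
    """Return every function address reachable from the entry points.

    `callees` maps a function address to the addresses it calls; it is a
    stand-in for what disassembling each function would discover.
    """
    seen = set()
    worklist = list(entry_points)
    while worklist:
        addr = worklist.pop()
        if addr in seen:
            continue
        seen.add(addr)
        # Newly discovered call targets become future starting points.
        worklist.extend(callees.get(addr, ()))
    return seen
```

In this model a function that nothing reachable calls is never visited, even if it sits in the same address range, which is why a bare range gives such an exploration nothing to start from.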

> It might be possible that future calculators would want completely different disassembly tables, so it would make more sense as v - but given the current scope of the project, I agree that .dt makes more sense.

Ok, I'll go for that. I considered per-vspace disassembly tables at some point, but the problem is that it would make comparing OS versions (ie. different versions) quite troublesome.

To be honest, I don't see myself working with entirely new types of calculators with incompatible processors - that would take more years and energy than I think I have :)

I'm keeping this open in case further ideas pop up and until it's implemented, but I'll think of this question as settled. ^^

Author
Owner

Closed by #4.

Reference: Lephenixnoir/fxos#3