Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Audit undeclared marker fields #5281

Closed
mstange opened this issue Dec 30, 2024 · 5 comments
Closed

Audit undeclared marker fields #5281

mstange opened this issue Dec 30, 2024 · 5 comments
Assignees

Comments

@mstange
Copy link
Contributor

mstange commented Dec 30, 2024

Profiles with lots of markers are too big.

A lot of space in the stringified JSON is taken up by the field keys in each marker's data object. Example:
{ ​​"type": "Task", "name": "Document::NotifyIntersectionObservers", "priority": 4, "priorityName": "Normal", "task": 3944 }

It would be nice if we didn't have to repeat those string keys ("name", "priority" etc) for every marker.

#4494 proposes moving the "type": "Task" part outside the data object.

Once that's done, we could change the data object into a data array:
["Document::NotifyIntersectionObservers", 4, "Normal", 3944]

In order to know which value belongs to which field, we would check the marker schema.

However, this can only work if all fields are declared in the schema.

What are the cases in which markers have fields which are not declared in the schema? I'll use this issue to research and write down the answer to this question.

┆Issue is synchronized with this Jira Task

@mstange mstange self-assigned this Dec 30, 2024
@mstange
Copy link
Contributor Author

mstange commented Dec 30, 2024

I identified the following classes of markers with undeclared fields:

  • Markers with custom handling in the front-end (Network, GCSlice, GCMinor, GCMajor, IPC)
  • Old markers which weren't cleaned up during upgrading (tracing, DOMEvent, Styles, FileIO, Text)
  • Markers with fancy labels / tooltips (e.g. Timer, WindowProc)
  • Markers with marker graphs (e.g. CPU)
  • Markers with hopes for new front-end capabilities (Url + ChannelMarker with channelId)
  • Unintentional omissions (old UserTiming markers, CC markers before bug 1869835)
  • Imported Chrome traces

There are also some fields with special handling which are respected on all marker types: type, cause, innerWindowID, and threadId. These should probably all be stored outside the data object.

Markers with custom handling in the front-end

Network markers have a whole lot of fields that aren't mentioned in any schema: startTime, endTime, requestMethod, id, status, cache, pri, URI, RedirectURI, count, domainLookupStart, domainLookupEnd, connectStart, tcpConnectEnd, connectEnd, httpVersion, redirectType, classOfService, isHttpToHttpsRedirect, redirectId, isPrivateBrowsing
Examples: https://share.firefox.dev/3VXUhgQ, https://share.firefox.dev/49XSCxQ, https://share.firefox.dev/4gReE7q

GCSlice, GCMinor, GCMajor markers have custom code for their tooltips.
Their timings / nursery / nurseryTimings fields are not mentioned in any schema.
Examples: https://share.firefox.dev/49ZhJQM , https://share.firefox.dev/41QdR2z

IPC markers are handled with custom code, too. Even modern versions of Firefox don't emit a schema for IPC markers: https://share.firefox.dev/4fAkzwQ
But upgraded pre-schema profiles do get an (incomplete) IPC schema from the v33 upgrader.

CompositorScreenshot markers have no schema, and their url, windowID, windowWidth and windowHeight are handled with custom code. https://share.firefox.dev/41XjYSG

Old markers which weren't cleaned during upgrading

Profile with old tracing markers which still have an interval field: https://share.firefox.dev/4grktJ7
Profile with old DOMEvent markers which still have category and phase fields: https://share.firefox.dev/4iUTuY6
Profile with old Styles markers which still have a category field: https://share.firefox.dev/49UGfCJ
Profile with old MainThreadLongTask markers which still have a category field: https://share.firefox.dev/4fEwDNu
Profile with old FileIO marker where upgrading stack to cause failed: https://share.firefox.dev/40e6sss
Profile with old Text marker where upgrading stack to cause failed: https://share.firefox.dev/3PhMp69

Profile with IPC markers which gets its IPC schema from the v33 upgrader, and this schema doesn't mention the startTime, endTime, otherPid, messageSeqno, side, and direction fields: https://share.firefox.dev/40q3zoR

(Profiles with IPC markers which don't go through the v33 upgrader don't have an IPC schema at all! See the previous section.)

Markers with fancy labels / tooltips

Timer markers have a prefix field which is only used in the label: https://share.firefox.dev/3DLdrQQ

WindowProc markers have name and messageLoop fields which are only used in the label: https://share.firefox.dev/400A3V9

The Phase markers in this profile have a phase field which is only used in the label but not listed in the fields separately: https://share.firefox.dev/3W2PriG

The ProcessingTime, ProcessEnergy and WakeUpCount markers (with names "Process CPU Time", "Process Energy" and "Thread Wake-ups") in this profile have a label field which is only used in the labels: https://share.firefox.dev/4gX2yd7

DebugBreak markers have a name field that's only used in the label: https://share.firefox.dev/3W1qAfb

Markers with marker graphs

Example: https://share.firefox.dev/400DXgC
This profile has CPU markers where each marker has string fields and number fields; the string fields are used in the tooltip and are mentioned in the marker schema's data list, and the number fields are used for the graph and are mentioned in the marker schema's graphs list.

Markers with hopes for new front-end capabilities

Url / ChannelMarker markers (from D200055) have a channelId field which hasn't been used yet, and isn't shown in the tooltip: https://share.firefox.dev/40fEaOz

Unintentional omissions

The UserTiming measure markers in this profile don't show the startMark / endMark fields in the tooltip: https://share.firefox.dev/4a05eof
This has been fixed in the meantime, I think.

The CC marker in this profile doesn't show the "JS Zones freed" in the tooltip - the mFreedJSZones field is ignored: https://share.firefox.dev/3PeU850
This was a bug in CCGCScheduler.cpp which has since been fixed; the bug was introduced in bug 1735434 and fixed in bug 1869835.

Imported Chrome traces

Imported profile with CompleteTraceEvent and InstantTraceEvent marker types: https://share.firefox.dev/4gB9P2D
Imported profile with data.type set to the marker name: https://share.firefox.dev/4iPMqfc

Useless data objects

The following markers get a special shout out because they have data objects which are fully useless, because they only have a type property, and no other fields, and there is no schema for their type:

@fqueze
Copy link
Contributor

fqueze commented Dec 30, 2024

* Markers with hopes for new front-end capabilities (Url + ChannelMarker with channelId, ThreadCpuUse / AddRemoveTimer / Timer with threadId)

Isn't there already a custom front-end handling in the tooltip for the threadId field?

@mstange
Copy link
Contributor Author

mstange commented Dec 31, 2024

Indeed! I wasn't aware of this.

if (data && data.threadId !== undefined) {
// This marker has some threadId data in its payload, which is about the
// thread where this event happened.
const threadId = data.threadId;
const occurringThreadName = threadIdToNameMap.get(threadId);

I will edit my comment to correct this. I also left out some other details which I will add.

@mstange
Copy link
Contributor Author

mstange commented Dec 31, 2024

I've filled in the missing details. This investigation is complete.

@mstange mstange closed this as completed Dec 31, 2024
@mstange
Copy link
Contributor Author

mstange commented Dec 31, 2024

I used the following checker function (see #5274), applied to a local tree which already had some other format changes and upgraders:

function _isKnownUndeclaredSchema(
  schemaName: string,
  _markerName: string
): boolean {
  switch (schemaName) {
    case 'Network':
    case 'VsyncTimestamp':
    case 'IPC':
    case 'CompositorScreenshot':
    case 'MainThreadLongTask':
    case 'CONTENT_FULL_PAINT_TIME':
    case 'CONTENT_FRAME_TIME':
    case 'BHR-detected hang':
    case 'CompleteTraceEvent':
      return true;
    default:
      return false;
  }
}

function _isKnownUndeclaredField(
  schemaName: string,
  fieldKey: string
): boolean {
  if (fieldKey === 'docShellId') { // 17a65e573bf902ec4fd8f68b2098b56c046c70d7
    return ['DOMEvent', 'Styles', 'tracing', 'Text'].includes(schemaName);
  }

  if (fieldKey === 'stack') { // 1y01189smkgcmh64p7x4r7j4qmtddhrge5f849g
    return ['FileIO', 'Text'].includes(schemaName);
  }

  switch (schemaName) {
    case 'tracing':
    case 'tracingCCFrom52Upgrader':
      return fieldKey === 'interval';
    case 'Network':
      return [
        'startTime',
        'endTime',
        'secureConnectionStart',
        'requestMethod',
        'requestStart',
        'responseStart',
        'responseEnd',
        'id',
        'status',
        'cache',
        'pri',
        'URI',
        'RedirectURI',
        'count',
        'domainLookupStart',
        'domainLookupEnd',
        'connectStart',
        'tcpConnectEnd',
        'connectEnd',
        'contentType',
        'httpVersion', // missing schema?
        'redirectType', // missing schema? 094cy56cgcc4q0602h768zy6zbrbzg7nn959bp8
        'classOfService', // missing schema?
        'isHttpToHttpsRedirect', // missing schema? 16x915xng64f9ac3heeykzp71af4am78rzf8vt0
        'redirectId', // missing schema? 16x915xng64f9ac3heeykzp71af4am78rzf8vt0
        'isPrivateBrowsing', // 66xtsa0cwv80zg3h3qxtgf5e7txaft80ew7crng
      ].includes(fieldKey);
    case 'Url':
    case 'ChannelMarker':
      return fieldKey === 'channelId';
    case 'GCMinor':
      return fieldKey === 'nursery' || fieldKey === 'nurseryTimings';
    case 'GCSlice':
    case 'GCMajor':
      return fieldKey === 'timings';
    case 'CC':
      return fieldKey === 'mFreedJSZones';
    case 'ProcessingTime':
    case 'ProcessEnergy':
    case 'WakeUpCount':
      return fieldKey === 'label';
    case 'ThreadCpuUse':
      return fieldKey === 'label';
    case 'Timer':
      return fieldKey === 'prefix';
    case 'DOMEvent':
      return (
        fieldKey === 'category' ||
        fieldKey === 'phase' ||
        fieldKey === 'interval'
      );
    case 'WindowProc':
      return fieldKey === 'messageLoop' || fieldKey === 'name';
    case 'CPU':
      return true;
    case 'Phase':
      return fieldKey === 'phase';
    case 'ContextSwitch':
      return fieldKey === 'name';
    case 'DebugBreak':
      return fieldKey === 'name';
    case 'IPC':
      return [
        'startTime',
        'endTime',
        'otherPid',
        'messageSeqno',
        'side',
        'direction',
        'messageType', // missing schema? 04zbwayx368gzwhta58f8af15ne7vrd2d44g0vg
        'phase', // missing schema?
        'sync', // missing schema?
      ].includes(fieldKey);
    case 'Styles':
      return fieldKey === 'category';
    case 'UserTiming':
      return fieldKey === 'name' || fieldKey === 'startMark' || fieldKey === 'endMark';
    case 'CPUSample':
      return true;
    case 'CompositorScreenshot':
      return ['url', 'windowID', 'windowWidth', 'windowHeight'].includes(fieldKey); // 16x915xng64f9ac3heeykzp71af4am78rzf8vt0
    case 'MainThreadLongTask':
      return fieldKey === 'category';
    default:
      return false;
  }
}

function checkProfileUndeclaredMarkerFields(
  profile: any,
  _hash: string
): Set<string> {
  const { meta, threads } = profile;
  const { markerSchema } = meta;
  const markerSchemaByName = new Map(markerSchema.map((s) => [s.name, s]));

  // const namesForEmptyDataWithoutType = new Set();
  // const namesForNonEmptyDataWithoutType = new Set();
  const namesForUnknownTypes = new Set();
  const unknownTypes = new Set();
  const undeclaredFieldKeys = new Set();
  const namesForUndeclaredFields = new Set();

  const outcomes = new Set();

  if (profile.meta.importedFrom === 'Chrome Trace') {
    outcomes.add('is Chrome trace');
    return outcomes;
  }

  for (let threadIndex = 0; threadIndex < threads.length; threadIndex++) {
    const thread = threads[threadIndex];
    const { markers, stringTable } = thread;
    for (let markerIndex = 0; markerIndex < markers.length; markerIndex++) {
      const nameIndex = markers.name[markerIndex];
      const data = markers.data[markerIndex];
      const name = stringTable.getString(nameIndex);
      if (!data) {
        continue;
      }

      if (!data.type) {
        // const dataIsEmpty = Object.keys(data).length === 0;
        // if (dataIsEmpty && !namesForEmptyDataWithoutType.has(name)) {
        //   console.log(
        //     `Found marker with name "${name}" and empty but non-null data but no data.type, thread index ${threadIndex}, marker index ${markerIndex}`
        //   );
        //   namesForEmptyDataWithoutType.add(name);
        //   outcomes.add(
        //     'has empty but non-null data'
        //   );
        // }
        // if (!dataIsEmpty && !namesForNonEmptyDataWithoutType.has(name)) {
        //   console.log(
        //     `Found marker with name "${name}" and non-null data but no data.type, data: ${JSON.stringify(data)}, thread index ${threadIndex}, marker index ${markerIndex}`
        //   );
        //   namesForNonEmptyDataWithoutType.add(name);
        //   outcomes.add(
        //     'has non-empty data without data.type'
        //   );
        // }
        continue;
      }

      const schema = markerSchemaByName.get(data.type);
      if (schema === undefined) {
        if (
          !_isKnownUndeclaredSchema(data.type, name) &&
          (!namesForUnknownTypes.has(name) || !unknownTypes.has(data.type))
        ) {
          console.log(
            `Found marker with data.type "${data.type}" and name "${name}" for which there is no schema, thread index ${threadIndex}, marker index ${markerIndex}`
          );
          namesForUnknownTypes.add(name);
          unknownTypes.add(data.type);
          outcomes.add('unknown schema for data.type');
        }
      }

      const schemaFields = schema !== undefined ? schema.fields : [];
      for (const fieldKey in data) {
        if (
          fieldKey === 'type' ||
          fieldKey === 'innerWindowID' ||
          fieldKey === 'threadId' ||
          fieldKey === 'cause'
        ) {
          continue;
        }

        const fieldValue = data[fieldKey];

        const fieldSchema = schemaFields.find((f) => f.key === fieldKey);
        if (fieldSchema === undefined) {
          if (
            !_isKnownUndeclaredField(data.type, fieldKey) &&
            (!namesForUndeclaredFields.has(name) ||
              !undeclaredFieldKeys.has(fieldKey))
          ) {
            console.log(
              `Found marker with data.type "${data.type}" and name "${name}" with undeclared field "${fieldKey}": "${fieldValue}", thread index ${threadIndex}, marker index ${markerIndex}`
            );
            namesForUndeclaredFields.add(name);
            undeclaredFieldKeys.add(fieldKey);
            outcomes.add('unknown field');
          }
        }
      }
    }
  }

  return outcomes;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants