Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added sampling to dnscap #15

Open
wants to merge 13 commits into
base: develop
Choose a base branch
from
37 changes: 18 additions & 19 deletions src/dnscap.c
Original file line number Diff line number Diff line change
Expand Up @@ -289,16 +289,16 @@ struct plugin {
LIST(struct plugin) plugins;

typedef struct{
UT_hash_handle hh; //makes the structure hashable using "uthash/uthash.h"
unsigned from; //these next 5 fields make up the compound key
int sport,dport,transaction_id;
unsigned to;
UT_hash_handle hh; //makes the structure hashable using "uthash/uthash.h"
uint32_t from; //these next 5 fields make up the compound key
int sport,dport,transaction_id;
uint32_t to;
}samplePacket ;

typedef struct{
unsigned from;
uint32_t from;
int sport,dport,transaction_id;
unsigned to;
uint32_t to;
}sample_lookup_key;

/* Forward. */
Expand Down Expand Up @@ -715,7 +715,7 @@ help_1(void) {
"\t[-x <pat>]+ [-X <pat>]+\n"
"\t[-B <datetime>] [-E <datetime>]\n"
"\t[-P plugin.so] [-U <str>]\n"
"\t[-q <unsinged int>]\n",
"\t[-q <nth>]\n",
ProgramName);
}

Expand Down Expand Up @@ -776,8 +776,8 @@ help_2(void) {
"\t-E <datetime> end collecting at this date and time\n"
"\t-M set monitor mode on interfaces\n"
"\t-D set immediate mode on interfaces\n"
"\t-q <unsigned int> output only every nth DNS query and only output responses\n \
if they correspond to one of the sampled queries\n"
"\t-q <nth> output only every nth DNS query and only output responses\n \
if they correspond to one of the sampled queries\n"
);
}

Expand Down Expand Up @@ -1132,8 +1132,8 @@ parse_args(int argc, char *argv[]) {
usage("the -b option is incompatible with -d and -g");
if (sample && chooseSidesResponse)
{
if (!((dir_wanted & DIR_INITIATE) != 0) && ((dir_wanted & DIR_RESPONSE) != 0))
usage("the -q option is incompatible with -s r");
if(((dir_wanted & DIR_RESPONSE) != 0) && ((dir_wanted & DIR_INITIATE) == 0))
usage("the -q option is incompatible with -s r");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second glance I do not see the reason to have chooseSidesResponse, dir_wanted is set to both direction at start and you really only need to check if DIR_RESPONSE is not set.

}
if (dumptrace >= 1) {
endpoint_ptr ep;
Expand Down Expand Up @@ -2381,10 +2381,9 @@ network_pkt(const char *descr, my_bpftimeval ts, unsigned pf,
ns_msg dnsmsgSample;
ns_initparse(dnspkt,dnslen,&dnsmsgSample);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is part of libbind and dnscap can be compiled without libbind.

Copy link
Author

@chegger chegger Sep 15, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean ns_msg and ns_initparse? Is there another way to retrieve the transaction id of a query? I've seen ns_initparse being used in the output functions in dump_dns.c, so I assumed it was safe to use.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look at the top #if HAVE_NS_INITPARSE && ..., you will need to disable all the sampling code if libbind is missing or create your own functions for parsing the required information.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I'll disable the sampling code if libbind is missing. In future i might try to write my own parsing functions

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sampling code is still compiled even if libbind does not exist, checking if sample is true or not does not remove the dependency of the functions.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but in parse_args() sample is only set to true if libbind does exist, by checking #if HAVE_NS_INITPARSE && HAVE_NS_PARSERR && HAVE_NS_SPRINTRR . Is this checkback not suitable?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You do know the difference between if() and #if right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do and I just realized where my mistake in thinking was at. I will work on it right away.

samplePacket *currentQuery;
void *ipAddrBufferFrom = &from.u;
void *ipAddrBufferTo = &to.u;
unsigned *fromBuffer = (unsigned*)ipAddrBufferFrom;
unsigned *toBuffer = (unsigned*)ipAddrBufferTo;

uint32_t *fromBuffer = (uint32_t*)&from.u;
uint32_t *toBuffer = (uint32_t*)&to.u;

keylen = offsetof(samplePacket,to) //keylen is used to define which fields of the hash structure are added
+ sizeof(*toBuffer) //as a compound key. Here, the key is composed of all fields between (and including)
Expand All @@ -2395,8 +2394,8 @@ network_pkt(const char *descr, my_bpftimeval ts, unsigned pf,
querycount++;
if(querycount % sampleAmount == 0)
{
currentQuery = malloc(sizeof(*currentQuery));
memset(currentQuery, 0, sizeof(*currentQuery));
currentQuery = calloc(1,sizeof(*currentQuery));
assert(currentQuery != NULL);
currentQuery->from = *fromBuffer;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are meant to be in samplePacket->from ?
From the struct definition from is just an unsigned and this would essentially put the first unsigned from the fromBuffer into currentQuery. That would be the first part of iaddr.u and that is only the IPv4 address (or a very small part of it) which means this does not work for IPv6.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

samplePacket->from should contain IP addresses (both IPv4 and v6 would be optimal).

On my system unsigned is 32 bits (which is the size of IPv4 adresses), so it should be able to store a full IPv4. I'll change it to uint_32 so it also works on all systems.
As to IPv6, I did not find out how to handle such big numbers properly. I thought it would be better to take a small portion of the IPv6 addresses than not to work with IPv6 at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry you need to find a way to support both v4 and v6 otherwise I will reject the pull request. I do not know how exactly uthash works but maybe you can use the either iaddr which from is.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially used from but I came to the conclusion that it doesn't work properly. I'll give it another try though. If I don't get it to work like this, I'll look for another way to support IPv6 addresses.

In case I can't get it to work with IPv6 at all, I'll inform you, so you can reject this pull request.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look at plugins/rssm/rssm.c for an example of how to make a hash table with both v4 and v6 addresses ("my_hashtbl").

dnscap_common.h defines 'iaddr' which you should use to store v4/v6 addrs.

rssm.c (and other plugin .c files) have iaddr_hash().

At some point it would make sense to move hashtbl.c and supporting functions out of plugins and into src.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a way that will work perfectly fine. Ill commit it on Monday!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry but you have implemented this in a very dangerous way and the use of AMOUNT_32BIT_IN_128BIT is unneeded. You are copying in and out of different structures without any check of structure sizes and seeing that the hash can take binary blobs as keys you should be able to put two iaddr structures within the hash entry for both to and from v4/v6 addresses.

currentQuery->to = *toBuffer;
currentQuery->sport = sport;
Expand All @@ -2409,8 +2408,8 @@ network_pkt(const char *descr, my_bpftimeval ts, unsigned pf,
}
else
{
sample_lookup_key *lookup_key = (sample_lookup_key*)malloc(sizeof(*lookup_key));;
memset(lookup_key, 0, sizeof(*lookup_key));
sample_lookup_key *lookup_key = (sample_lookup_key*)calloc(1,sizeof(*lookup_key));
assert(lookup_key != NULL);
lookup_key->from = *toBuffer;
lookup_key->to = *fromBuffer;
lookup_key->dport = sport;
Expand Down