
strncpy is more or less perfect in my line of work, where a lot of binary protocols have fixed-size string fields (char x[32], etc.).

The padding is needed to make packets hashable and not leak uninitialized bytes.

You just never assume a string is null terminated when reading, using strnlen or strncpy when reading as well.
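A minimal sketch of that pattern, assuming a 32-byte field. The names NAME_FIELD_SIZE, field_write and field_read are made up for illustration; memchr stands in for strnlen here so the snippet only needs ISO C.

```c
#include <stddef.h>
#include <string.h>

#define NAME_FIELD_SIZE 32  /* hypothetical fixed-width wire field */

/* Write side: copy at most NAME_FIELD_SIZE bytes and zero-pad the rest.
   A source of 32 bytes or more leaves the field with no terminator. */
static void field_write(char field[NAME_FIELD_SIZE], const char *src)
{
    strncpy(field, src, NAME_FIELD_SIZE);
}

/* Read side: never trust the field to be terminated.
   dst must hold at least NAME_FIELD_SIZE + 1 bytes.
   Returns the length of the string stored in dst. */
static size_t field_read(char *dst, const char field[NAME_FIELD_SIZE])
{
    /* Bounded length, same result as POSIX strnlen(field, NAME_FIELD_SIZE). */
    const char *end = memchr(field, '\0', NAME_FIELD_SIZE);
    size_t len = end ? (size_t)(end - field) : NAME_FIELD_SIZE;

    memcpy(dst, field, len);
    dst[len] = '\0';
    return len;
}
```

Because every unused byte in the field is zero, two packets carrying the same name compare and hash identically.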



Yep, that's the intended use case for strncpy().

It's not really suitable for general purpose programming like the OP is doing. It won't null terminate the string if the buffer is filled, which will cause you all sorts of problems. If the buffer is not filled, it will write extra null bytes to fill the buffer (not a problem, but unnecessary).
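Both behaviours are easy to see with a tiny helper. This is only a sketch, and strncpy_is_terminated is a made-up name:

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dst with strncpy, then reports whether a '\0' ended up
   anywhere inside dst, i.e. whether dst is safe to treat as a C string. */
static int strncpy_is_terminated(char *dst, size_t dst_size, const char *src)
{
    strncpy(dst, src, dst_size);
    return memchr(dst, '\0', dst_size) != NULL;
}
```

With a 4-byte buffer, copying "ab" returns 1 and zeroes both trailing bytes, while copying "abcd" returns 0 because all four bytes are payload.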

On FreeBSD you have strlcpy(), and Windows has strcpy_s(), which will do what the OP needs. I remember someone trying to import strlcpy() into Linux, but Ulrich Drepper had a fit and said no.
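Where strlcpy() isn't available, a stand-in with the same contract is short. This is a sketch, not the BSD source, and copy_bounded is a hypothetical name: it copies at most dst_size - 1 bytes, always terminates when dst_size is nonzero, and returns strlen(src) so the caller can spot truncation.

```c
#include <stddef.h>
#include <string.h>

/* strlcpy-like copy (assumed semantics, see lead-in).
   A return value >= dst_size means the copy was truncated. */
static size_t copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t src_len = strlen(src);

    if (dst_size > 0) {
        size_t n = src_len < dst_size - 1 ? src_len : dst_size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';          /* always terminated, never padded */
    }
    return src_len;
}
```

Typical use is `if (copy_bounded(buf, sizeof buf, name) >= sizeof buf) { /* handle truncation */ }`.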

> You just never assume a string is null terminated when reading, using strnlen or strncpy when reading as well.

Not really possible when dealing with operating system level APIs that expect and require null-terminated strings. It's safer and less error-prone to keep everything null terminated at all times.

Or just write in C++ and use std::string, or literally any other language. C is terrible when it comes to text strings.


> On freebsd you have strlcpy()

strlcpy() came from OpenBSD and was later ported to FreeBSD, Solaris, etc.


Yup.

Lots of good security & safety innovations came from OpenBSD.


You shouldn't use any of those garbage functions. Just ignore \0 entirely, manage your lengths, and use memcpy.
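Roughly what that looks like, as a sketch. The str_view type and buf_append are made-up names, and the caller is assumed to keep *used <= cap:

```c
#include <stddef.h>
#include <string.h>

/* Non-owning pointer + length pair; no terminator involved. */
struct str_view {
    const char *data;
    size_t len;
};

/* Appends src to buf, which has capacity cap and currently holds *used bytes.
   Returns 0 on success, -1 if src does not fit (nothing is copied). */
static int buf_append(char *buf, size_t cap, size_t *used, struct str_view src)
{
    if (src.len > cap - *used)
        return -1;
    memcpy(buf + *used, src.data, src.len);
    *used += src.len;
    return 0;
}
```

Every bounds check is an integer comparison on lengths you already have, so nothing ever scans for a \0.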


I don't write C, but I've always wondered why Pascal-like string wrappers aren't popular, i.e. where the first 2 bytes hold the length of the string, followed by a \0-terminated string for compatibility.
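For reference, the layout being described could be sketched like this. struct pstr and pstr_new are hypothetical names, and the 2-byte prefix caps the length at 65535:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Layout: [uint16_t len][len bytes]['\0'] */
struct pstr {
    uint16_t len;
    char data[];    /* C99 flexible array member, kept '\0'-terminated */
};

/* Allocates a pstr holding a copy of the first len bytes of src.
   Returns NULL if len does not fit in 16 bits or allocation fails.
   The caller releases it with free(). */
static struct pstr *pstr_new(const char *src, size_t len)
{
    if (len > UINT16_MAX)
        return NULL;

    struct pstr *s = malloc(sizeof *s + len + 1);
    if (s == NULL)
        return NULL;

    s->len = (uint16_t)len;
    memcpy(s->data, src, len);
    s->data[len] = '\0';
    return s;
}
```

s->data can go straight to any API that wants a C string, while s->len gives the length without a scan.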


2 bytes is not enough; usually you'll see a whole size_t worth of bytes for the length.

But you could do something utf-8 inspired I suppose where some bit pattern in the first byte of the length tells you how many bytes are actually used for the length.
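One way to sketch that idea: lengths below 128 take a single byte, and otherwise the first byte is 0x80 | n followed by n big-endian length bytes. This happens to be the same shape as the short and long length forms in ASN.1 DER; it is swapped in here as an illustration, and len_encode / len_decode are made-up names.

```c
#include <stddef.h>
#include <stdint.h>

/* Encodes len into out (up to 9 bytes). Returns the number of bytes written. */
static size_t len_encode(uint64_t len, unsigned char *out)
{
    if (len < 0x80) {
        out[0] = (unsigned char)len;        /* short form: one byte */
        return 1;
    }

    size_t n = 0;                           /* bytes needed for len */
    for (uint64_t v = len; v != 0; v >>= 8)
        n++;

    out[0] = (unsigned char)(0x80 | n);     /* long form: count, then bytes */
    for (size_t i = 0; i < n; i++)
        out[1 + i] = (unsigned char)(len >> (8 * (n - 1 - i)));
    return 1 + n;
}

/* Decodes a length from in (avail bytes available).
   Returns bytes consumed, or 0 if the input is malformed or truncated. */
static size_t len_decode(const unsigned char *in, size_t avail, uint64_t *len)
{
    if (avail == 0)
        return 0;
    if (in[0] < 0x80) {
        *len = in[0];
        return 1;
    }

    size_t n = in[0] & 0x7F;
    if (n == 0 || n > 8 || avail < 1 + n)
        return 0;

    uint64_t v = 0;
    for (size_t i = 0; i < n; i++)
        v = (v << 8) | in[1 + i];
    *len = v;
    return 1 + n;
}
```

Short strings pay one byte of overhead, and the prefix only grows when the string does.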


Pascal originally required you to specify the length of the string before you did anything with it.

This is a totally good idea, but was considered to be too much of a pain to use at the time.


In C you have to do that too, like... malloc()?


You still need a 0-terminated string to pass to the APIs of most libraries (including the ones that ship with the OS; in this case, Win32).


Yeah, Drepper said the same thing.


>It won't null terminate the string if the buffer is filled, which will cause you all sorts of problems.

if you don't know how to solve/avoid a problem like that, you will have all sorts of other problems

pound-define strncpy to a compile failure, write the function you want instead, fix all the compile errors, and then not only move on with your life but never speak of it again, because that is a waste of time. C++ std::string is trash, Java strings are trash; duplicate what you want from those in your own C string library and sail ahead. no language has better defined behaviors than C, that's why so many other languages, interpreters, etc. have been implemented in C.


I thought a string is just a byte array that has null as its last element?

How can a string not be null-terminated?


Whether the string ends in a null byte or not is up to you as a programmer. It's only an array of bytes, even though the convention is to null-terminate it.

Well maybe more than just a convention, but there is nothing preventing you from setting the last byte to whatever you want.


Everything in C is just array of bytes, some would argue uint32_t is just array of 4 bytes. That's why we need convention.

A string is defined as a byte array with a null at the end. Remove the null and it's not a string anymore.


> Everything in C is just array of bytes, some would argue uint32_t is just array of 4 bytes

That isn't how the C language is defined. The alignment rules may differ between those two types. Consider also the need for the union trick to portably implement type-punning in C. Also, the C standard permits CHAR_BIT to equal 32, so in C's understanding of a 'byte', the uint32_t type might in principle correspond to just a single byte on some exotic (but C-compliant) platform.

No doubt there are other subtleties besides.
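To make the alignment and punning point concrete, here is a sketch of two ways to turn raw bytes into a uint32_t without casting the pointer, which could be misaligned. The function and union names are made up, and the resulting value depends on the platform's byte order.

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Union used for punning: store into one member, read the other. */
union u32_bytes {
    uint32_t value;
    unsigned char bytes[sizeof(uint32_t)];
};

/* memcpy route: works for any source alignment. */
static uint32_t u32_from_bytes_memcpy(const unsigned char *bytes)
{
    uint32_t v;
    memcpy(&v, bytes, sizeof v);
    return v;
}

/* Union route: the union object itself is suitably aligned for uint32_t. */
static uint32_t u32_from_bytes_union(const unsigned char *bytes)
{
    union u32_bytes u;
    memcpy(u.bytes, bytes, sizeof u.bytes);
    return u.value;
}
```

Note that sizeof(uint32_t) is 4 only where CHAR_BIT is 8; what always holds is sizeof(uint32_t) * CHAR_BIT == 32.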


That's only one possible convention, and it's not a particularly good one at that.



