Commit | Line | Data |
---|---|---|
9d6b1ce6 DSH |
1 | |
2 | OpenSSL ASN1 Revision | |
3 | ===================== | |
4 | ||
5 | This document describes some of the issues relating to the new ASN1 code. | |
6 | ||
7 | Previous OpenSSL ASN1 problems | |
8 | ============================== | |
9 | ||
10 | OK, why did the OpenSSL ASN1 code need revising in the first place? Well, | |
11 | there are lots of reasons, some of which are included below... | |
12 | ||
13 | 1. The code is difficult to read and write. For every single ASN1 structure | |
14 | (e.g. SEQUENCE) four functions need to be written for new, free, encode and | |
15 | decode operations. This is a very painful and error-prone operation. Very few | |
16 | people have ever written any OpenSSL ASN1 and those that have usually wish | |
17 | they hadn't. | |
18 | ||
19 | 2. Partly because of point 1, the code is bloated and takes up a disproportionate | |
20 | amount of space. The SEQUENCE encoder is particularly bad: it essentially | |
21 | contains two copies of the same operation, one to compute the SEQUENCE length | |
22 | and the other to encode it. | |
23 | ||
24 | 3. The code is memory based: that is, it expects to be able to read the whole | |
25 | structure from memory. This is fine for small structures but if you have a | |
26 | (say) 1GB PKCS#7 signedData structure it isn't such a good idea... | |
27 | ||
28 | 4. The code for the ASN1 IMPLICIT tag is evil. It is handled by temporarily | |
29 | changing the tag to the expected one, attempting to read it, then changing it | |
30 | back again. This means that decode buffers have to be writable even though they | |
31 | are ultimately unchanged. This gets in the way of constification. | |
32 | ||
33 | 5. The handling of EXPLICIT isn't much better. It adds a chunk of code into | |
34 | the decoder and encoder for every EXPLICIT tag. | |
35 | ||
36 | 6. APPLICATION and PRIVATE tags aren't even supported at all. | |
37 | ||
38 | 7. Even IMPLICIT isn't complete: there is no support for implicitly tagged | |
39 | types that are not OPTIONAL. | |
40 | ||
41 | 8. Much of the code assumes that a tag will fit in a single octet. This is | |
42 | only true if the tag is 30 or less (mercifully tags over 30 are rare). | |
43 | ||
44 | 9. The ASN1 CHOICE type has to be largely handled manually: there aren't any | |
45 | macros that properly support it. | |
46 | ||
47 | 10. Encoders have no concept of OPTIONAL and have no error checking. If the | |
48 | passed structure contains a NULL in a mandatory field, that field is simply | |
49 | not encoded, resulting in an invalid structure. | |
50 | ||
51 | 11. It is tricky to add ASN1 encoders and decoders to external applications. | |
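
Point 8 is easier to see with the identifier octets written out. The following is only a sketch: put_identifier() is a made-up helper, not an OpenSSL function, and it assumes the standard BER/DER identifier layout where tag numbers 0 to 30 sit in the low five bits of one octet and anything larger is sent as 0x1f followed by base 128 groups.

```c
#include <limits.h>
#include <stddef.h>

/*
 * put_identifier() is a made-up helper for illustration, not an OpenSSL
 * function. It writes the identifier octets for a tag into out and returns
 * the number of octets written. cls is the tag class in the range 0..3
 * (UNIVERSAL, APPLICATION, context specific, PRIVATE).
 */
size_t put_identifier(unsigned char *out, unsigned int cls, int constructed,
                      unsigned long tag)
{
    unsigned char lead = (unsigned char)((cls & 3u) << 6);
    unsigned char groups[sizeof(unsigned long) * CHAR_BIT / 7 + 1];
    size_t n = 0, i = 0;

    if (constructed)
        lead |= 0x20;

    /* Tags 0..30 fit in the low five bits of a single octet */
    if (tag <= 30) {
        out[0] = (unsigned char)(lead | tag);
        return 1;
    }

    /* Otherwise the low five bits are all ones and the tag follows */
    out[n++] = (unsigned char)(lead | 0x1f);

    /* Split the tag into 7 bit groups, least significant first */
    do {
        groups[i++] = (unsigned char)(tag & 0x7f);
        tag >>= 7;
    } while (tag != 0);

    /* Emit the most significant group first; all but the last set bit 8 */
    while (i > 0) {
        i--;
        out[n++] = (unsigned char)(groups[i] | (i != 0 ? 0x80 : 0));
    }
    return n;
}
```

A universal constructed tag 16 (SEQUENCE) comes out as the single octet 0x30, whereas a context specific tag 31 already needs two octets. Code that reads one octet and treats it as the whole tag breaks on the second case.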
52 | ||
53 | Template model | |
54 | ============== | |
55 | ||
56 | One of the major problems with revision is the sheer volume of the ASN1 code. | |
57 | Attempts to change (for example) the IMPLICIT behaviour would result in a | |
58 | modification of *every* single decode function. | |
59 | ||
60 | I decided to adopt a template based approach. I'm using the term 'template' | |
61 | in a manner similar to SNACC templates: it has nothing to do with C++ | |
62 | templates. | |
63 | ||
64 | A template is a description of an ASN1 module as several constant C structures. | |
65 | It describes in a machine readable way exactly how the ASN1 structure should | |
66 | behave. If this template contains enough detail then it is possible to write | |
67 | versions of new, free, encode, decode (and possibly other operations) that | |
68 | operate on templates. | |
69 | ||
70 | Instead of having to write code to handle each operation, only a single | |
71 | template needs to be written. If new operations are needed (such as a 'print' | |
72 | operation) only a single new template based function needs to be written | |
73 | which will then automatically handle all existing templates. | |
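
To make the idea concrete, here is a toy version of the approach. It is a sketch under heavy assumptions: FIELD_TMPL, STRUCT_TMPL, DEMO and the tmpl_*() functions are all invented for this example and are much simpler than the real templates (every field is just a pointer to a long). What it does show is that the structure is described once as constant data and the new, free and check operations are written once for all templates.

```c
#include <stddef.h>
#include <stdlib.h>

/* Every name below is made up; this is not the layout OpenSSL uses. */
typedef struct {
    const char *name;  /* field name, useful for a generic print */
    size_t offset;     /* where the field lives in the C structure */
    int optional;      /* non-zero if the field may be absent (NULL) */
} FIELD_TMPL;

typedef struct {
    size_t size;               /* sizeof() the C structure */
    const FIELD_TMPL *fields;  /* one entry per field, in order */
    size_t nfields;
} STRUCT_TMPL;

/* Toy structure: each field is a heap allocated long, NULL if absent */
typedef struct {
    long *version;  /* OPTIONAL */
    long *serial;   /* mandatory */
} DEMO;

static const FIELD_TMPL demo_fields[] = {
    { "version", offsetof(DEMO, version), 1 },
    { "serial",  offsetof(DEMO, serial),  0 }
};

static const STRUCT_TMPL demo_tmpl = {
    sizeof(DEMO), demo_fields, sizeof(demo_fields) / sizeof(demo_fields[0])
};

/* Locate a field inside any structure described by a template */
static long **field_ptr(void *obj, const FIELD_TMPL *f)
{
    return (long **)((char *)obj + f->offset);
}

/* Generic free: works for every structure that has a template */
void tmpl_free(const STRUCT_TMPL *t, void *obj)
{
    size_t i;

    if (obj == NULL)
        return;
    for (i = 0; i < t->nfields; i++)
        free(*field_ptr(obj, &t->fields[i]));
    free(obj);
}

/* Generic new: allocate mandatory fields, leave OPTIONAL ones NULL */
void *tmpl_new(const STRUCT_TMPL *t)
{
    size_t i;
    void *obj = malloc(t->size);

    if (obj == NULL)
        return NULL;
    for (i = 0; i < t->nfields; i++)
        *field_ptr(obj, &t->fields[i]) = NULL;
    for (i = 0; i < t->nfields; i++) {
        long **fp = field_ptr(obj, &t->fields[i]);

        if (t->fields[i].optional)
            continue;
        *fp = calloc(1, sizeof(**fp));
        if (*fp == NULL) {
            tmpl_free(t, obj);
            return NULL;
        }
    }
    return obj;
}

/* Generic check an encoder could run first: returns 1 if every
 * mandatory field is present, 0 otherwise */
int tmpl_check(const STRUCT_TMPL *t, void *obj)
{
    size_t i;

    for (i = 0; i < t->nfields; i++) {
        if (!t->fields[i].optional && *field_ptr(obj, &t->fields[i]) == NULL)
            return 0;
    }
    return 1;
}
```

Supporting a second structure means writing another table like demo_fields and nothing else. Adding a 'print' operation means writing one more function that walks the table.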
74 | ||
75 | Plans for revision | |
76 | ================== | |
77 | ||
78 | The revision will consist of the following steps. Other than the first two, | |
79 | these can be handled in any order. | |
80 | ||
81 | o Design and write template new, free, encode and decode operations, initially | |
82 | memory based. *DONE* | |
83 | ||
84 | o Convert existing ASN1 code to template form. *IN PROGRESS* | |
85 | ||
86 | o Convert an existing ASN1 compiler (probably SNACC) to output templates | |
87 | in OpenSSL form. | |
88 | ||
89 | o Add support for BIO based ASN1 encoders and decoders to handle large | |
90 | structures, initially blocking I/O. | |
91 | ||
92 | o Add support for non blocking I/O: this is quite a bit harder than blocking | |
93 | I/O. | |
94 | ||
95 | o Add new ASN1 structures, such as OCSP, CRMF, S/MIME v3 (CMS), attribute | |
96 | certificates, etc. | |
97 | ||
98 | Description of major changes | |
99 | ============================ | |
100 | ||
101 | The BOOLEAN type now takes three values. 0xff is TRUE, 0 is FALSE and -1 is | |
102 | absent. The meaning of absent depends on the context. If, for example, the | |
103 | boolean type is DEFAULT FALSE (as in the case of the critical flag for | |
104 | certificate extensions) then -1 is FALSE; if DEFAULT TRUE then -1 is TRUE. | |
105 | Usually the value will only ever be read via an API which will hide this from | |
106 | an application. | |
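
The kind of accessor that hides this from an application could look something like the sketch below. bool_value() is a hypothetical name, not an existing API; the caller passes in the DEFAULT that applies to the field in question.

```c
/*
 * Hypothetical helper: turn the three valued BOOLEAN into a C truth value.
 * stored is 0xff (TRUE), 0 (FALSE) or -1 (absent); default_value is the
 * DEFAULT for this particular field.
 */
int bool_value(int stored, int default_value)
{
    if (stored == -1)
        return default_value != 0;  /* absent: fall back to the DEFAULT */
    return stored != 0;
}
```

For the critical flag of a certificate extension the call would be bool_value(flag, 0), so an absent flag reads as FALSE.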
107 | ||
108 | There is an evil bug in the old ASN1 code that mishandles OPTIONAL with | |
109 | SEQUENCE OF or SET OF. These are both implemented as a STACK structure. The | |
110 | old code would omit the structure if the STACK was NULL (which is fine) or if | |
111 | it had zero elements (which is NOT OK). This causes problems because an empty | |
112 | SEQUENCE OF or SET OF will result in an empty STACK when it is decoded but when | |
113 | it is encoded it will be omitted, resulting in different encodings. The new code | |
114 | only omits the encoding if the STACK is NULL; if it contains zero elements it | |
115 | is encoded as empty. There is an additional problem though: because an empty | |
116 | STACK was omitted, sometimes the corresponding *_new() function would | |
117 | initialize the STACK to empty so an application could immediately use it. With | |
118 | the new code the field starts out NULL, so using it immediately won't work: a | |
119 | new STACK should be allocated first. One instance of this is the X509_CRL list | |
120 | of revoked certificates: a helper function X509_CRL_add0_revoked() has been | |
121 | added for this purpose. | |
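
The NULL versus empty distinction can be modelled without any OpenSSL headers. In the sketch below INT_STACK, encode_optional_seq() and stack_add() are stand-ins invented for this example: the encoder writes nothing for a NULL stack and an empty SEQUENCE (30 00) for an empty one, and stack_add() allocates the stack on first use, which is roughly the job a helper like X509_CRL_add0_revoked() does for the real structure.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for an OpenSSL STACK holding small integers */
typedef struct {
    int *items;
    size_t num;
} INT_STACK;

/*
 * Encode an OPTIONAL SEQUENCE OF INTEGER. Returns the number of octets
 * written, 0 if the field is absent, or -1 on error. To keep the sketch
 * short each value is masked to 0..127 (one content octet) and at most
 * 42 elements are accepted so the length fits in one octet.
 */
int encode_optional_seq(const INT_STACK *sk, unsigned char *out)
{
    size_t i;

    if (sk == NULL)
        return 0;                 /* absent: nothing at all is written */
    if (sk->num > 42)
        return -1;                /* would need a long form length */
    out[0] = 0x30;                /* SEQUENCE */
    out[1] = (unsigned char)(3 * sk->num);
    for (i = 0; i < sk->num; i++) {
        out[2 + 3 * i] = 0x02;    /* INTEGER */
        out[3 + 3 * i] = 0x01;    /* one content octet */
        out[4 + 3 * i] = (unsigned char)(sk->items[i] & 0x7f);
    }
    return (int)(2 + 3 * sk->num);
}

/* Append a value, allocating the stack first if it is still NULL.
 * Returns 1 on success and 0 on allocation failure. */
int stack_add(INT_STACK **psk, int value)
{
    INT_STACK *sk = *psk;
    int *items;

    if (sk == NULL) {
        sk = malloc(sizeof(*sk));
        if (sk == NULL)
            return 0;
        sk->items = NULL;
        sk->num = 0;
    }
    items = realloc(sk->items, (sk->num + 1) * sizeof(*items));
    if (items == NULL) {
        if (*psk == NULL)
            free(sk);
        return 0;
    }
    items[sk->num++] = value;
    sk->items = items;
    *psk = sk;
    return 1;
}
```

An application that pushes straight onto the field after *_new() would be handing a NULL to the push routine. Going through a helper of this shape avoids that.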
122 | ||
123 | The X509_ATTRIBUTE structure used to have an element called 'set' which took | |
124 | the value 1 if the attribute value was a SET OF or 0 if it was a single value. Due | |
125 | to the behaviour of CHOICE in the new code this has been changed to a field | |
126 | called 'single' which is 0 for a SET OF and 1 for single. The old field has | |
127 | been deleted to deliberately break source compatibility. Since this structure | |
128 | is normally accessed via higher level functions this shouldn't break too much. | |
129 | ||
130 | The X509_REQ_INFO certificate request info structure no longer has a field | |
131 | called 'req_kludge'. This used to be set to 1 if the attributes field was | |
132 | (incorrectly) omitted. You can check to see if the field is omitted now by | |
133 | checking if the attributes field is NULL. Similarly if you need to omit | |
134 | the field then free attributes and set it to NULL. | |
135 | ||
136 | The top level 'detached' field in the PKCS7 structure is no longer set when | |
137 | a PKCS#7 structure is read in. PKCS7_is_detached() should be called instead. | |
138 | The behaviour of PKCS7_get_detached() is unaffected. | |
139 | ||
140 | The values of 'type' in the GENERAL_NAME structure have changed. This is | |
141 | because the old code used the ASN1 initial octet as the selector. The new | |
142 | code uses the index in the ASN1_CHOICE template. | |
143 | ||
144 | The DIST_POINT_NAME structure has changed to be a true CHOICE type. | |
145 | ||
146 | typedef struct DIST_POINT_NAME_st { | |
147 |     int type; | |
148 |     union { | |
149 |         STACK_OF(GENERAL_NAME) *fullname; | |
150 |         STACK_OF(X509_NAME_ENTRY) *relativename; | |
151 |     } name; | |
152 | } DIST_POINT_NAME; | |
153 | ||
154 | This means that either name.fullname or name.relativename should be set | |
155 | and type reflects which one is in use. That is, if name.fullname is set then | |
156 | type is 0, and if name.relativename is set then type is 1. | |
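
One way to keep type and the union in step is to only ever touch them through small setters. The sketch below uses stand-in types so that it compiles without the OpenSSL headers; DPN_SKETCH and the dpn_*() functions are hypothetical and only mirror the shape of DIST_POINT_NAME.

```c
#include <stddef.h>

/* Stand-ins for STACK_OF(GENERAL_NAME) and STACK_OF(X509_NAME_ENTRY) */
typedef struct { int dummy; } GENERAL_NAMES_STUB;
typedef struct { int dummy; } NAME_ENTRIES_STUB;

/* Same shape as DIST_POINT_NAME: a selector plus a union */
typedef struct {
    int type;  /* 0: name.fullname is set, 1: name.relativename is set */
    union {
        GENERAL_NAMES_STUB *fullname;
        NAME_ENTRIES_STUB *relativename;
    } name;
} DPN_SKETCH;

/* Setting a member always updates the selector with it */
void dpn_set_fullname(DPN_SKETCH *dpn, GENERAL_NAMES_STUB *gens)
{
    dpn->type = 0;
    dpn->name.fullname = gens;
}

void dpn_set_relativename(DPN_SKETCH *dpn, NAME_ENTRIES_STUB *rdn)
{
    dpn->type = 1;
    dpn->name.relativename = rdn;
}

/* Readers check the selector before touching the union */
GENERAL_NAMES_STUB *dpn_get_fullname(const DPN_SKETCH *dpn)
{
    return dpn->type == 0 ? dpn->name.fullname : NULL;
}
```

Reading name.fullname while type is 1 would reinterpret the relativename pointer, which is why the getter checks type first.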
157 | ||
158 | With the old code, using the i2d functions would typically involve: | |
159 | ||
160 | unsigned char *buf, *p; | |
161 | int len; | |
162 | /* Find length of encoding */ | |
163 | len = i2d_SOMETHING(x, NULL); | |
164 | /* Allocate buffer */ | |
165 | buf = OPENSSL_malloc(len); | |
166 | if(buf == NULL) { | |
167 |     /* Malloc error */ | |
168 | } | |
169 | /* Use temp variable because p gets advanced to point to the end of the | |
170 |  * encoding. | |
171 |  */ | |
172 | p = buf; | |
173 | i2d_SOMETHING(x, &p); | |
174 | ||
175 | ||
176 | Using the new i2d you can also do: | |
177 | ||
178 | unsigned char *buf = NULL; | |
179 | int len; | |
180 | len = i2d_SOMETHING(x, &buf); | |
181 | if(len < 0) { | |
182 |     /* Malloc error */ | |
183 | } | |
184 | ||
185 | and it will automatically allocate and populate a buffer with the | |
186 | encoding. After this call 'buf' will point to the start of the | |
187 | encoding, which is len bytes long. The buffer should be freed with | |
188 | OPENSSL_free() when it is no longer needed. |
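
The three ways of calling an i2d function can be tried out with a mock. i2d_DEMO() below is not an OpenSSL function: it is a stand-in with the same calling convention that encodes a value from 0 to 127 as a three octet DER INTEGER, and it uses plain malloc() where the real code would use OPENSSL_malloc().

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Mock with the i2d calling convention. Returns the encoded length,
 * or -1 on error.
 *   out == NULL   only the length is returned
 *   *out == NULL  a buffer is allocated and *out points to its start
 *   otherwise     the encoding is written at *out and *out is advanced
 */
int i2d_DEMO(int val, unsigned char **out)
{
    const int len = 3;  /* 02 01 xx */
    unsigned char *p;

    if (val < 0 || val > 127)
        return -1;
    if (out == NULL)
        return len;
    if (*out == NULL) {
        p = malloc((size_t)len);
        if (p == NULL)
            return -1;
        *out = p;       /* caller owns the buffer; not advanced */
    } else {
        p = *out;
        *out += len;    /* old style: pointer ends up past the encoding */
    }
    p[0] = 0x02;        /* INTEGER */
    p[1] = 0x01;        /* one content octet */
    p[2] = (unsigned char)val;
    return len;
}
```

Called with a pointer to a NULL buf the mock hands back a fresh buffer that the caller later frees. Called with a pointer into an existing buffer it behaves like the old style and leaves the pointer just past the encoding, which is why the old example needs the temporary p.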