None of the assignments my classmate struggled with required him write code that processes code, so I don't think that was the source of his confusion.
My guess is that it has more to do with the fact that CONS takes a whole list as an argument, then returns a different whole list, which superficially seems too computationally expensive to do in a loop.
-> A symbol can be an identifier and just a piece of data.
Above, the first DEFUN is a standard identifier of the programming language Common Lisp. The second DEFUN is just a symbol.
The first DEFUN form is a program. The second DEFUN expression is just data. Thus it makes no sense to indent the second DEFUN expression like a program. We don't know if it is a program, it could be just anything.
-> A list can be data and a program.
This specific double nature of built-in data structures, here symbols and lists, is already a hurdle to get over.
The developer has to know which lists are data lists, which are forms and which are expressions in inside a form.
(cond ((quote (a))))
->
(cond ((quote (a)))) is a form, which is meant to be evaluated
(a) is data
((quote (a))) is a clause of COND, which is neither data, nor a form to be evaluated
Thus the user has to understand that in Lisp these lists serve different purposes. Which is which depends on the context and the Lisp syntax & evaluation rules.
This then even may make functions possible which contain their own definition as data:
My guess is that it has more to do with the fact that CONS takes a whole list as an argument, then returns a different whole list, which superficially seems too computationally expensive to do in a loop.