Comparative Function Arguments

by Skyler Ferris

Introduction

Programming languages are complex programs that evolve over time with the aim of better serving their users. The most successful programming languages have sophisticated governance systems such as steering committees and RFC-like processes to incorporate feedback from users with on-the-ground knowledge into the language design. These processes embed significant practical knowledge into the programming language.

This paper is a proof of concept for extracting this knowledge from these languages. It examines a selection of programming languages and compares the features they provide around the syntactic specification of function arguments, on both the author and the caller side. It then takes the lessons learned from these disparate languages and creates a unified framework that incorporates these lessons. The purpose is NOT to create a one-size-fits-all "best" solution. Different programming languages are useful in different contexts precisely because they provide different features that suit the context the user is working in. However, the differences between programming languages go far beyond what is necessary for adapting to different contexts. For example, differences in standard library naming conventions, which delimiters to use, and allowable characters in symbol names are cosmetic differences that do not add context-specific value. However, these differences do add mental overhead when programmers context-switch between domains and increase the overall cost of learning to program in different domains. Wouldn't it be nice if we had a core programming language that settled the cosmetic differences but was flexible enough to provide the context-specific features that users benefit from?

Conventions

Footnotes are indicated by an asterisk (*) and can be viewed by hovering your cursor over them. They contain less formal commentary about the paper.

Citations are indicated by an italicized parenthesized nickname, such as (PySource) or (guile-manual). The nicknames give the reader an understanding of where the information is generally coming from while hovering over the citation will provide detailed information for the source and the location in the source where that specific citation is coming from. For example, all citations that say (PySource) refer to the same Python source repository, but each citation will also provide a different filename(s) and line number(s) indicating precisely where the relevant information can be found.

Reproducibility and Transparency

The source code used in example snippets can be found in the online git repository. The repository includes files appropriate for use with GNU Guix to reproduce the software environment I used while creating this paper. It pins to a specific revision of the main Guix channel so that updates do not interfere with reproducibility. Running make shell from the top-level directory launches a pure shell* (Here, "pure" means that environment conditions such as variables are unset so that the shell does not use any artifacts from the base operating system by mistake.) for the user so that environmental factors do not interfere with reproducibility.

AI Use Disclosure

I have asked an AI chatbot to give critical feedback on this paper. I took this feedback and incorporated it into future edits exactly as I would have if a human had given me this feedback. AI has not written even a single punctuation mark in this paper.

The commits I asked for feedback on follow:

Features Examined

Some languages provide similar features under different names, and sometimes the names that grew organically around a feature are not particularly descriptive. For example, different communities use different terms to refer to the values that shell functions interpret as implying a value. A shell function might recognize a value "--verbose" to mean that a variable named "verbose" should have the value "true". I have heard this mechanism variously referred to as a flag, a switch, or an option. This paper refers to this mechanism as an "implicit value" because the caller uses a name associated with the argument to imply a value which does not appear explicitly. A complete list of terms which describe a feature provided by more than one language follows:

Positional Value (other name(s): N/A):
These values are associated with arguments based on the index of the value in the list of all positional values.

Default Value (other name(s): Optional Argument):
A value that an author associates with an argument for use when the caller declines to supply a value.

Named Value (other name(s): Keyword Argument, Option Argument* (The term "option argument" is not to be confused with the term "optional argument". The former is used by shell users to refer to an argument that comes after an option (such as "--ignore" "foo" in the previous example) while an optional argument is a term used by scripters and system programmers to refer to an argument associated with a default value.) ):
These values are associated with an argument based on a name which the caller attaches to the value. The argument may be associated with several synonymous names.

Repeating Value (other name(s): N/A):
Languages which support named values must have some strategy for handling the case where they receive multiple values for the same argument. In most cases, the strategy is either to raise an error or to overwrite the previous value. However, other strategies are possible, such as counting the number of times the value is given (sensible, for example, with implicit values, see below).

Implicit Value (other name(s): Switch, Flag, Option):
An argument has an implicit value if the caller can specify a name associated with the argument but omit the value. For example, many CLI tools recognize "--verbose" to mean that additional output should be provided. An argument with an implicit value may also have antonyms, which invert the semantics of the primary name. For example, some CLI tools also recognize "--quiet" to mean that verbose output should be disabled, and possibly that some non-verbose output should be disabled as well.

Typed Value (other name(s): N/A):
Some languages allow or require associating a type with an argument, in which case the value provided by the caller must be compatible with that type.

Destructuring (other name(s): Pattern, Unpacking):
Some languages provide a mechanism to break a composite structure into its component parts. In some cases this functionality is specified by the author, in others by the caller. In both cases the caller provides a value of a specific type and different parts of that value (for example, different elements in a list or different members of a struct) are bound to different local variables in the body of the function. The difference is whether the author decides how the value is destructured (as in the case of Rust's patterns) or the caller does (as in the case of Python's argument unpacking). It is theoretically possible for a language to provide both of these variants simultaneously.

Variadic Function (other name(s): N/A):
A function which can accept an arbitrary number of values. Variadic functions are distinct from arguments with default values because with default values the author controls the range of acceptable argument counts and each value provided by the caller is bound to a different local variable. With variadic functions, the caller can pass in an arbitrarily large number of arguments (physical limitations notwithstanding) and the values are typically aggregated into a container; the variable bound to this container is the variadic variable. Variadic values are values which are associated with this feature. For example, a variadic function may require one positional value at the beginning of the value list. The first value is not a variadic value but all other values are.
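
To make this vocabulary concrete, the following Python sketch exercises several of the features defined above in a single function. The function describe and its argument names are hypothetical and exist only for illustration. Python is used here because it is one of the languages examined later, not because its behavior is normative.

```python
def describe(first, second="fallback", *rest, verbose=False, **named):
    """Report how each value supplied by the caller was bound."""
    return {
        "first": first,      # positional value
        "second": second,    # default value, used when the caller omits it
        "rest": rest,        # variadic variable collecting extra positional values
        "verbose": verbose,  # can only be supplied as a named value
        "named": named,      # variadic variable collecting extra named values
    }


# Positional value only; "second" falls back to its default value.
only_positional = describe("a")

# Named values: the caller attaches a name to each value.
with_names = describe("a", verbose=True, colour="red")

# Caller-side destructuring: the tuple is unpacked into positional values.
pair = ("x", "y", "z")
unpacked = describe(*pair)

print(only_positional)
print(with_names)
print(unpacked)
```

In the last call the first two elements of the tuple are bound to first and second, and the remaining element becomes a variadic value.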

Each of these features is involved with one or more concerns:

Argument Binding:
These features are concerned with the manner in which values provided by the caller are associated with arguments specified by the author. Features with this concern include Positional Values, Named Values, Repeating Values, Destructuring, and Variadic Functions.

Variable Specifications:
These features are concerned with how authors specify the local variables that are associated with arguments. Features with this concern include Destructuring and Typed Values.

Value Specifications:
These features are concerned with the ways that callers can provide values which will be associated with arguments. Features with this concern include Default Values, Repeating Values, Destructuring, and Implicit Values.

Value Constraints:
These features are concerned with restricting the set of values that callers can associate with a specific argument. The only feature with this concern is Typed Values.

Structure of Paper

The remainder of this paper will be divided into 3 parts: Description, Analysis, and Implementation. The purpose of these parts is to first understand what the current state of the craft is, then assess the advantages and disadvantages of different approaches, and finally create a framework that allows programmers to make context-aware decisions in an interoperable manner.

Description

Overview

This part is dedicated to describing the features of various languages that are sampled by this paper, using the terms defined by this paper. Each of these sections will start with a summary of the features provided by the language. Next, it will explain the background knowledge necessary in order to understand the feature descriptions. This explanation will include an example of a "simple function and call" which serves two purposes. First, it provides the reader with a concrete example of what will be described in the context of this language. For example, the description in Python will focus on the behavior of the interpreter and the bytecode emitted by the compiler. Second, it provides a point of comparison when describing features. In Python, the description of destructuring primarily describes the CALL_FUNCTION_EX instruction, with some description of list-building instructions towards the end, rather than describing every single instruction that is emitted. This keeps each section shorter because they do not have to explain the baseline they are being compared to; the sections are not self-contained, but they all have exactly one dependency and the dependency plus the section are self-contained. This makes it easier for a reader to focus only on the features that are relevant to them, if they so choose. Finally, each of the features provided by the language will be described.

Each feature description will start either with a statement that the feature is not provided or a statement of how the feature is provided. For example, in Python positional parameters are the "default behavior". Tutorials (including the official tutorial contained in the repository) commonly introduce functions by using positional parameters and introduce "keyword arguments" as a separate feature at a later point. Therefore, the Python section about positional values starts with "provided by default" while the section about named values starts with "provided through keyword arguments". This helps the reader understand the feature's relationship to the language and clarifies what terminology will be useful if they are reading language documentation or performing an internet search. After this statement, any distinctive qualities of the feature as provided by the language will be noted.

Finally, there will be 3 subsections: Syntax, Implementation, and Historical Record. All 3 of these sections inform the implementation of the mechanism.

The syntax section explains what the source code looks like when the feature is used. This clarifies what the feature looks like in practice.

The implementation section explains the language behavior which causes the feature to be provided. This suggests implementation strategies that might be useful in the framework.

The historical record section discusses, where possible, the motivation for the feature and lessons learned from implementation and community response. This helps the framework avoid or solve problems that others have dealt with in the past.

Bash

Bash, along with getopt, provides the following features:

Background

Bash has an interpreter which executes plain-text documents (bash-5.1.16-source) doc/bash.info lines 111-122 Bash Source. 5.1.16. SHA256: 5bac17218d3911834520dad13cd1f85ab944e1c09ae1aba55906be1f8192f558 GNU Project. https://ftp.gnu.org/gnu/bash/bash-5.1.16.tar.gz . Unlike most languages (and all other languages examined in this paper), bash does not expect or allow function authors to specify arguments in the function signature. Instead, Bash makes all functions variadic and binds each argument to a special variable named after the position - the first argument will be bound to $1, the second to $2, etc. Note that the $ simply indicates the start of a variable (instead of a plain string) in Bash.

This paper also assesses the command-line utility getopt, specifically the version provided by GNU which has features beyond what other implementations provide. I chose to include the use of this utility because it enjoys widespread use and it is more interesting to assess the feature-rich utility than the relatively featureless facilities provided by the core language (positional arguments and variadic functions only). Getopt has 2 different concepts when it comes to arguments, which it refers to as options and arguments. In the nomenclature of this paper, options are names and arguments are values. Getopt further distinguishes between option arguments, values that are associated with a name, and non-option arguments, values that are associated with a position.

Furthermore, getopt does not perform variable binding in the traditional manner that we will see in other languages in this paper. Instead, it supports later processing by normalizing the arguments in two ways: ordering them and separating them. It orders them so that all named values appear first, then a literal --, then all positional values. It separates them by taking each named value that was supplied as a single value of the form --name=value and splitting it into two values, --name value. This organization makes it straightforward to process named values using the built-in while, case, and shift constructs. Positional values can then be referenced by their position names, $1, $2, etc.
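
The normalization described above can be sketched in Python, whose standard library includes a getopt module modeled on the C function. This illustrates the shape of the output only; it is not the Util-Linux implementation. The helper normalize and the names -v and --name are hypothetical. The sketch assumes that the POSIXLY_CORRECT environment variable is unset, because gnu_getopt stops at the first positional value when it is set.

```python
import getopt


def normalize(args, shortopts, longopts):
    """Reorder args as: named values, a literal "--", positional values."""
    # gnu_getopt allows names and positional values to be intermixed.
    opts, positional = getopt.gnu_getopt(args, shortopts, longopts)
    normalized = []
    for name, value in opts:
        normalized.append(name)
        # The module reports "" for a name without a value. This sketch
        # therefore cannot distinguish "--name=" from a bare name.
        if value != "":
            # "--name=value" becomes two separate values.
            normalized.append(value)
    return normalized + ["--"] + positional


result = normalize(["one", "--name=value", "-v", "two"], "v", ["name="])
print(result)
```

The printed list places --name, value, and -v before the literal --, followed by the positional values one and two in their original relative order.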

The implementation of getopt is split across two repositories: GLibc provides C-level functions which implement the functionality (usable by C programs to process command-line arguments) and Util-Linux provides a wrapper around these functions to make them accessible directly from the command line.

The GLibc function getopt_long is the entry point. It accepts an array of argument values along with metadata to control how they are processed (for example, valid names are passed through a control structure). It expects the user to call it repeatedly in order to process all arguments and performs 2 processes while scanning the argument list. One process is to immediately indicate the next value name (option) through the return value and return the value (option argument), when present, through a global variable. When this process is finished it returns a special value to indicate such. The second process (which occurs simultaneously with the first process) is to reorganize the given list of values so that all named values precede all positional values.

The Util-Linux wrapper calls getopt_long and prints the return values in a normalized form.

Simple Function and Call

			function add-values {
				getopt -o "" -- "$@"
				eval set -- "$(getopt -o "" -- "$@")"
				shift
				echo $(($1 + $2))
			}
			
			add-values 1000 1001
			

The first line in the function runs getopt, but does not have any side-effects; it simply prints the organized and normalized arguments for examination in the output. The second line is the idiomatic way to invoke getopt; it uses the output of getopt to rewrite the argument list so that it takes effect. The third line shifts the argument list; that is, it moves all positional arguments to the left one position so that $3 becomes $2, $2 becomes $1, and $1 is dropped. It is idiomatic to call shift before processing positional arguments so that the literal -- inserted by getopt does not cause confusion. The final line uses the rewritten argument list to perform the addition.

Calling the function results in the following output:

			 -- '1000' '1001'
			2001
			

Positional Value

Bash supports positional arguments through the use of special variables $1, $2, etc. Note that, without further processing by the author, named values and their names will appear as positional values. It is idiomatic to use the built-in shift command when processing named values in order to avoid this issue. Getopt facilitates this by ensuring that all named values precede all positional values and by dividing the two groups with a literal --.

Syntax & Semantics

The syntax and semantics of positional values do not differ from the simple call.

Implementation

getopt_long recognizes that a value is not a name because it either does not start with a -, or because it is only one character long (that is, the value is exactly the string "-") (glibc-source) posix/getopt.c line 490
GLibc Source. 2.35. SHA256: 5123732f6b67ccd319305efd399971d58592122bcc2a6518a1bd2510dd0cf52e GNU Project. https://ftp.gnu.org/gnu/glibc/glibc-2.35.tar.xz
. Values which are named are automatically processed when the name is processed, so they will not be mistaken for positional values (glibc-source) Long options: posix/getopt.c lines 573, 595 and posix/getopt.c lines 351-365. Short options: posix/getopt.c lines 653-696.
GLibc Source. 2.35. SHA256: 5123732f6b67ccd319305efd399971d58592122bcc2a6518a1bd2510dd0cf52e GNU Project. https://ftp.gnu.org/gnu/glibc/glibc-2.35.tar.xz
. Values which are named are moved to the beginning of the array whenever a new value starts being processed (glibc-source) posix/getopt.c lines 504-520
GLibc Source. 2.35. SHA256: 5123732f6b67ccd319305efd399971d58592122bcc2a6518a1bd2510dd0cf52e GNU Project. https://ftp.gnu.org/gnu/glibc/glibc-2.35.tar.xz
.

Historical Record

Bash has used positional parameters in the above manner since at least version 1.14.7, the oldest version available in the referenced repository (bash-1.14.7-source) documentation/features.texi lines 243-248.
Bash Source. 1.14.7. Commit: 726f63884db0132f01745f1fb4465e6621088ccf. GNU Project. https://git.savannah.gnu.org/git/bash.git
.

In the getopt C function specified by POSIX, arguments are not reorganized so that named values appear prior to positional values. Instead, the standard assumes that the caller provides all named values before positional ones and stops processing at the first positional value found. The sentinel value -- to indicate the beginning of positional arguments was included in the standard (POSIX.2-1992) pages 732-733.
Portable Operating System Interface (POSIX). 1003.2-1992. Institute of Electrical and Electronics Engineers. 1992.
. The argument reorganization has been a GNU extension since the oldest version of GLibc available in the referenced repository (GLibc-first-commit) posix/getopt.c lines 73-85
GLibc Source. Commit 28f540f45bbacd939bfd07f213bcad2bf730b1bf. GNU Project. https://sourceware.org/git/glibc.git
.
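
The difference between the POSIX behavior and the GNU extension can be observed with Python's standard getopt module, which offers both styles as getopt.getopt and getopt.gnu_getopt. The sketch below is only an analogy for the C functions discussed above. The option letters and the file name are hypothetical, and it assumes that POSIXLY_CORRECT is unset.

```python
import getopt

args = ["-a", "input.txt", "-b"]

# POSIX style: scanning stops at the first positional value, so "-b" is
# left behind and treated as a positional value.
posix_opts, posix_rest = getopt.getopt(args, "ab")

# GNU style: scanning continues past positional values, so "-b" is still
# recognized as a name.
gnu_opts, gnu_rest = getopt.gnu_getopt(args, "ab")

print(posix_opts, posix_rest)
print(gnu_opts, gnu_rest)
```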

Default Value

While default values are not explicitly supported by getopt, they are trivial to implement because of the manual processing that getopt expects the programmer to perform.

Syntax & Semantics

Default values do not have first-class support in Bash and Getopt, so there is no specific syntax for specifying default values. However, it is trivial and common for shell scripters to support default values so this will be examined in the implementation section.

Implementation

Default values can be implemented by initializing local variables to default values, then updating them if a new value is given through an argument. For example, the following code provides a default value of 0 for both $A and $B.

			function add-values {
				getopt -o "" -- "$@"
				eval set -- "$(getopt -o "" -- "$@")"
			
				A=0
				B=0
			
				# Move past the literal --
				shift
			
				# The code [ -n "$1" ] checks if the variable contains a value
				if [ -n "$1" ]; then
					A="$1"
					shift
				fi
			
				if [ -n "$1" ]; then
					B="$1"
					shift
				fi
			
				echo $(($A + $B))
			}
			
Historical Record
Bash and getopt have never provided first-class support for default values.

Named Value

Syntax & Semantics
Implementation
Historical Record

In the POSIX specification, only argument names with a single letter are allowed and argument names may only be prefixed with a single hyphen (POSIX.2-1992) pages 732-733.
Portable Operating System Interface (POSIX). 1003.2-1992. Institute of Electrical and Electronics Engineers. 1992.
. The ability to use multicharacter option names with a double hyphen has been a GNU extension since the oldest commit available in the referenced repository (GLibc-first-commit) posix/getopt.c lines 354-361
GLibc Source. Commit 28f540f45bbacd939bfd07f213bcad2bf730b1bf. GNU Project. https://sourceware.org/git/glibc.git
.

Repeating Value

Syntax & Semantics
Implementation
Historical Record

Implicit Value

Syntax & Semantics
Implementation
Historical Record

Typed Value

Syntax & Semantics
Implementation
Historical Record

Caller Destructuring

Syntax & Semantics
Implementation
Historical Record

Author Destructuring

Syntax & Semantics
Implementation
Historical Record

Variadic Function

Syntax & Semantics
Implementation
Historical Record

Python

Python provides the following features:

Python used to provide author destructuring for tuples, but this was removed in version 3.0.

Some bytecode instructions include hints in parentheticals after the instruction. For example, LOAD_NAME will include the name being loaded:

			LOAD_NAME  0 (add_values)
			
While LOAD_CONST will include the literal value:
			LOAD_CONST 1 (1000)
			
These hints are helpfully provided by the Python disassembler.
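
For readers who want to see these hints for themselves, the following sketch uses the standard dis module to list each instruction of a compiled call together with its hint. It is a minimal illustration rather than the tooling used to produce the listings in this paper, and the exact set of instructions printed varies between CPython versions.

```python
import dis

# Compile a call expression without executing it, so add_values does not
# need to be defined.
code = compile("add_values(1000, 1001)", "<example>", "eval")

hints = []
for instruction in dis.get_instructions(code):
    # argrepr is the human-readable hint that dis shows in parentheses.
    hints.append((instruction.opname, instruction.argrepr))
    print(f"{instruction.opname:<12} {instruction.argrepr}")
```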

Background

Python has a compiler which produces bytecode (PDG) internals/compiler.rst section "Abstract"
Python Developer Guide. Commit: 323f9cf9438730fb64ed71b40a3cb343b6724841. Python Software Foundation. https://github.com/python/devguide
and an interpreter which executes the bytecode (PDG) internals/interpreter.rst section "Introduction"
Python Developer Guide. Commit: 323f9cf9438730fb64ed71b40a3cb343b6724841. Python Software Foundation. https://github.com/python/devguide
. Argument and return values are given using a stack managed by the interpreter; instructions may modify or move these values, even if this is not the primary purpose of the instruction (PDG) internals/interpreter.rst section "The Evaluation Stack"
Python Developer Guide. Commit: 323f9cf9438730fb64ed71b40a3cb343b6724841. Python Software Foundation. https://github.com/python/devguide
.

There are 2 families of instructions that are used throughout the examples in this section. The LOAD family puts values on the stack from different locations depending on the instruction. The CALL family initiates a function call after the stack has been prepared. There are also example-specific instructions which will be explained alongside the relevant example.

Note: This section omits bookkeeping instructions that are not topically relevant. For example, when calling a non-method function (one which is not associated with an object instance), the interpreter pushes NULL onto the stack before pushing the function. This instruction, and similarly uninteresting instructions, are omitted for brevity.

LOAD family

Instructions prefixed with LOAD_ retrieve a value from an instruction-dependent location and put it onto the stack. Each instruction receives an integer which represents an index into a C-level array. Which array is referenced depends on the instruction. When the array contains variable names, the instruction also retrieves the value associated with that name. The entries below explain the contents of the array that each instruction references.

LOAD_CONST:
Constant values which appear literally or implicitly in source code (PySource) Doc/library/dis.rst lines 964-966; Python/ceval.c lines 1314-1536
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
.

LOAD_FAST:
Names of local variables which are guaranteed to be initialized (PySource) Doc/library/dis.rst lines 1253-1259; Lib/inspect.py line 514; Python/ceval.c lines 1314-1536
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
.

LOAD_NAME:
Names of non-local variables. If a local variable exists with the same name as a non-local variable then the value bound to the local variable will be returned (PySource) Doc/library/dis.rst lines 969-972; Lib/inspect.py line 511; Python/ceval.c lines 1314-1536
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
.

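
The indexing described above can be observed from Python itself. The sketch below assumes that the code object attributes co_names, co_consts, and co_varnames are the Python-level views of the arrays that LOAD_NAME, LOAD_CONST, and LOAD_FAST index into. That correspondence holds for the version examined here but is an implementation detail, so treat this as an illustration only.

```python
import dis

code = compile("add_values(1000, 1001)", "<example>", "eval")

# Arrays stored on the code object.
print("co_names: ", code.co_names)
print("co_consts:", code.co_consts)

# Resolve each instruction's integer argument by indexing the array by hand.
resolved = []
for instruction in dis.get_instructions(code):
    if instruction.opname == "LOAD_NAME":
        resolved.append(code.co_names[instruction.arg])
    elif instruction.opname == "LOAD_CONST":
        resolved.append(code.co_consts[instruction.arg])

print(resolved)


def add_values(a, b):
    return a + b


# Names of local variables, including arguments, in index order.
print(add_values.__code__.co_varnames)
```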
CALL family

These instructions tell the interpreter to call a function. This paper views this family as rooted in the plain CALL instruction, with all others being variants on this core instruction. When it needs to reference a behavior which occurs when a function is called, it examines only the plain CALL instruction and assumes that other instructions behave similarly unless the purpose of the variant is to change that specific behavior.

CALL:
Receives an integer indicating the number of argument values provided by the caller. The stack will contain the function to call followed by the argument values in separate stack locations (PySource) Doc/library/dis.rst lines 1398-1410; Python/ceval.c lines 1314-1536
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
.

Simple Function and Call

			def add_values(a, b):
			    return a + b
			
			add_values(1000, 1001)
			

The bytecode generated for the call to add_values performs 3 tasks. First, it pushes the function onto the stack. Next, it pushes literal values which will become associated with arguments. Finally, it calls the function.

			LOAD_NAME  0 (add_values)
			LOAD_CONST 1 (1000)
			LOAD_CONST 2 (1001)
			CALL       2
			

The bytecode generated for the definition is similar. It uses LOAD_FAST (instead of LOAD_NAME) to refer to the variables associated with arguments and BINARY_OP (instead of CALL) to use the built-in + operator.

			LOAD_FAST    0 (a)
			LOAD_FAST    1 (b)
			BINARY_OP    0 (+)
			

Positional Value

Provided by default. Generally, callers can choose whether to provide values by name or position when they make the call. All positional values must precede all named values (PyDoc) reference/expressions.html section 6.3.4 "Calls"
Python Documentation. 3.12.2. SHA-256: d600427f22db970ddf4f5699fdb5a8ebda8e9e92ddb09af807bbeabf07e64df6. Python Software Foundation. https://docs.python.org/3.12/archives/python-3.12.2-docs-html.zip
. Function authors can specify that some arguments will only receive their value by position. These are referred to as "positional-only arguments".

Syntax & Semantics

Callers specify positional values by providing a comma-separated list of values. Function authors specify positional-only arguments by listing a literal / after the final positional-only argument (PyDoc) reference/compound_stmts.html section 8.7 "Function Definitions"
Python Documentation. 3.12.2. SHA-256: d600427f22db970ddf4f5699fdb5a8ebda8e9e92ddb09af807bbeabf07e64df6. Python Software Foundation. https://docs.python.org/3.12/archives/python-3.12.2-docs-html.zip
.

			def add_values_mixed(position, /, either):
			    return position + either
			
			# valid
			add_values_mixed(1000, 1001)
			add_values_mixed(1000, either=1001)
			
			# invalid: the value for argument
			# "position" cannot be given by name
			# add_values_mixed(position=1000, either=1001)
			
Implementation

The bytecode for function definitions is identical to the simple definition regardless of the presence of positional-only arguments. The restriction is enforced within the CALL instruction. In particular, the helper function positional_only_passed_as_keyword uses the co_posonlyargcount and co_localsplusnames members of the code object. These variables track the number of positional-only arguments (PySource) Doc/library/inspect.rst lines 180-181
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
and the names of all arguments (PySource) Include/cpython/code.h line 155
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
respectively. If any names of positional-only arguments appear as keyword arguments then the helper raises an error (PySource) Python/ceval.c lines 1182-1244
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
. Note that the helper is only called if the function does not include a variadic variable for named values (PySource) Python/ceval.c lines 1417-1431
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
.
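
The observable effect of this check, including the exception for functions with a variadic variable for named values, can be sketched as follows. The function collect is hypothetical and is introduced only to show the second case.

```python
def add_values_mixed(position, /, either):
    return position + either


def collect(position, /, **named):
    # With a variadic variable for named values, the name "position" is
    # free to appear as an ordinary named value.
    return position, named


try:
    add_values_mixed(position=1000, either=1001)
    error = None
except TypeError as exc:
    # The positional-only argument was supplied by name.
    error = str(exc)

print(error)
print(collect(1000, position=1001))
```

The first call is rejected with a TypeError, while the second succeeds and binds the named value position inside the variadic variable instead of the positional-only argument.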

The bytecode for positional values is identical to the bytecode for the simple call. Python stores the values of local variables in the C-level array localsplus. The CALL instruction copies positional arguments from the stack into this array (PySource) Python/ceval.c lines 1341-1353
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
.
Historical Record

Positional values have always been available in Python and require no special syntax to use* (Unfortunately, positional values are assumed to be the default method of passing arguments by most programmers, including language authors, so there is no good citation for this assertion.).

PEP 570 introduced positional-only arguments (PEPs) peps/pep-0570.rst
Python Enhancement Proposals (PEPs). Commit: fca6000dfd8b55127c9a1fcbac809ddb27aafab7. Python Software Foundation. https://github.com/python/peps
. It gives several justifications for the change, most of which are concerned with maintaining a healthy ecosystem. There are two relevant* (There are also several concerns mentioned which are specific to Python and/or its implementation. For example, it references PEP 399 which requires that pure Python code and C extensions have the same expressive power. While important to the Python community, these concerns are not relevant to this paper.) ecosystem harms the PEP is concerned with: inappropriate use of value names by callers and increased maintenance burden for library authors.

Inappropriate use of value names includes relying on names that carry no meaning, as happens with a math function that takes a single argument (for example, the sqrt function takes one argument named x). It also includes providing values in an illogical order, such as calling the range function and supplying the stop value before the start value.

The increased maintenance burden occurs because all argument names are automatically and irrevocably added to the API surface of all libraries. It could be the case that a library author wants to implement a change which should be non-breaking in principle, but prompts a variable name change for clarity. This variable name change transforms the overall change into a breaking change.

The PEP is also concerned with functions that include a variadic variable for named parameters. For these functions, any non-variadic variables restrict the domain of the variadic variable, as their names will be associated with the distinct variable rather than the variadic one.
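As a minimal sketch of the maintenance and variadic concerns above, the example below uses the `/` marker that PEP 570 introduced to declare positional-only arguments. The function `scale` and its parameter names are hypothetical and chosen only for illustration.

```python
# Hypothetical function: `value` is positional-only because it appears
# before the "/" marker.
def scale(value, /, factor=2, **options):
    # Callers cannot write scale(value=...), so renaming `value` later
    # would not break them. The key "value" also stays available to the
    # variadic `options` dictionary.
    return value * factor, options


# The name "value" is collected by **options rather than bound to the
# positional-only parameter.
print(scale(10, value="still allowed"))

try:
    scale(value=10)  # no positional value was supplied
except TypeError as error:
    print("TypeError:", error)
```

Without the `/` marker, the first call would fail because `value` would receive two values.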

Finally, the PEP notes the curious case of the range function, which the PEP describes as accepting "an optional parameter to the left of its required parameter." In particular, if the range function only receives a single argument it is interpreted as the end of the range, but if it receives 2 arguments then the first is interpreted as the start while the second is interpreted as the end. This concern does not appear to be addressed by PEP 570* At least, I do not see anything that addresses it when I read the PEP and the implementation of range still inspects the number of provided arguments manually (PySource) Objects/rangeobject.c lines 81-120
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
.
.
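The argument-counting approach described above can be modelled in pure Python. This is only a sketch: the helper name `resolve_range_bounds` is hypothetical, the optional step argument of the real range is omitted, and the code is not a transcription of the C implementation.

```python
# Hypothetical helper that resolves range-like bounds by counting the
# arguments it received.
def resolve_range_bounds(*args):
    if len(args) == 1:
        # A single argument is interpreted as the end of the range.
        start, stop = 0, args[0]
    elif len(args) == 2:
        # Two arguments are interpreted as the start, then the end.
        start, stop = args
    else:
        raise TypeError(f"expected 1 or 2 arguments, got {len(args)}")
    return start, stop


print(resolve_range_bounds(5))
print(resolve_range_bounds(2, 5))
```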

Default Value

Provided by default argument values. Default values are evaluated exactly once, at definition time (PyDoc) tutorial/controlflow.html section 4.8.1 "Default Argument Values"
Python Documentation. 3.12.2. SHA-256: d600427f22db970ddf4f5699fdb5a8ebda8e9e92ddb09af807bbeabf07e64df6. Python Software Foundation. https://docs.python.org/3.12/archives/python-3.12.2-docs-html.zip

Syntax & Semantics

Function authors can define a default value by adding a literal = after the name of an argument, then the value (PyDoc) reference/compound_stmts.html section 8.7 "Function Definitions"
Python Documentation. 3.12.2. SHA-256: d600427f22db970ddf4f5699fdb5a8ebda8e9e92ddb09af807bbeabf07e64df6. Python Software Foundation. https://docs.python.org/3.12/archives/python-3.12.2-docs-html.zip

			def add_values(mandatory, optional=2000):
			    return mandatory + optional
			
			# valid
			add_values(1000)
			add_values(1000, 1001)
			
			# invalid: the first argument is not
			# associated with a default value
			# add_values()
			
The value may refer to a variable, but the value contained by the variable at definition time will always be used as the default value:
			value = 2000
			
			def add_values(mandatory, optional=value):
				return mandatory + optional
			
			value = 1000
			
			add_values(0) # returns 2000
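A common consequence of evaluating the default exactly once is that a mutable default is shared between calls. The sketch below illustrates this with a hypothetical function `append_value`; it is an illustration rather than a recommended pattern.

```python
# Hypothetical function: the list literal is evaluated once, when the
# "def" statement runs, not on each call.
def append_value(value, bucket=[]):
    bucket.append(value)
    return bucket


first = append_value(1)
second = append_value(2)

# Both calls returned the same list object, which now holds both values.
print(first is second, second)
```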
			
Implementation

Default values do not impact the bytecode generated for the definition or the call: both are identical to those of the simple call. Instead, the CALL instruction retrieves default values from the function object and uses them when necessary (PySource) Python/ceval.c lines 1314-1536
Trace from CALL implementation to initialize_locals definition: Call to _PyEvalFramePushAndInit Python/bytecodes.c lines 2689-2691 Call to initialize_locals Python/ceval.c line 1594 Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
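The stored defaults can also be observed from Python code, because a function exposes them through its `__defaults__` attribute. The sketch below replaces that tuple after definition to show that the call looks the default up at call time instead of having it compiled into the call site.

```python
def add_values(mandatory, optional=2000):
    return mandatory + optional


# The default values, in parameter order.
print(add_values.__defaults__)

# Replacing the tuple changes what later calls receive, even though the
# call expression itself is unchanged.
add_values.__defaults__ = (1,)
print(add_values(1000))
```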

Historical Record

Default values were added in version 1.0.2 (PySource) Misc/HISTORY lines 32809-32811
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
There is additionally a note that default values "would now be quite sensible" in the version 0.9.4 release notes. This version changed argument processing so that functions receive all arguments as separate values, rather than as a single tuple (PySource) Misc/HISTORY lines 34550-34639
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
.

PEP 671 proposes adding a feature which would allow function authors to provide an expression which will produce a default value at call time ("late evaluation") (PEPs) peps/pep-0671.rst
Python Enhancement Proposals (PEPs). Commit: fca6000dfd8b55127c9a1fcbac809ddb27aafab7. Python Software foundation. https://github.com/python/peps
. The mailing list discussion includes several disagreements, including whether or not it is appropriate for a function signature to contain un-inspectable objects and technical difficulties about scoping rules for late evaluated values. The proposal is still in the "draft" state, so it might be added in the future (possibly after trivial or significant changes to the proposal), but there has been no activity on the mailing list since 2021. (PEP 671 Discussion) PEP 671 (late-bound arg defaults), next round of discussion. TODO: list authors. https://mail.python.org/archives/list/python-ideas@python.org/thread/UVOQEK7IRFSCBOH734T5GFJOEJXFCR6A/

Named Value

Provided through keyword arguments. Generally, callers can choose whether to provide values by name or position when they make the call. All named values must follow all positional values (PyDoc) reference/expressions.html section 6.3.4 "Calls"
Python Documentation. 3.12.2. SHA-256: d600427f22db970ddf4f5699fdb5a8ebda8e9e92ddb09af807bbeabf07e64df6. Python Software Foundation. https://docs.python.org/3.12/archives/python-3.12.2-docs-html.zip
. Function authors can specify that some arguments will only receive their value by name. These are referred to as "keyword-only arguments".

Syntax & Semantics

Callers provide named values by writing first a symbolic name, then a literal =, then the value. (PyDoc) reference/expressions.html section 6.3.4 "Calls"
Python Documentation. 3.12.2. SHA-256: d600427f22db970ddf4f5699fdb5a8ebda8e9e92ddb09af807bbeabf07e64df6. Python Software Foundation. https://docs.python.org/3.12/archives/python-3.12.2-docs-html.zip
. Function authors specify keyword-only arguments by listing a literal * before the first keyword-only argument (PyDoc) reference/compound_stmts.html section 8.7 "Function Definitions"
Python Documentation. 3.12.2. SHA-256: d600427f22db970ddf4f5699fdb5a8ebda8e9e92ddb09af807bbeabf07e64df6. Python Software Foundation. https://docs.python.org/3.12/archives/python-3.12.2-docs-html.zip
.

			def add_values_mixed(either, *, named):
			    return either + named
			
			# valid
			add_values_mixed(named=1001, either=1000)
			add_values_mixed(1000, named=1001)
			
			# invalid: the value for argument "named"
			# must be given by name
			# add_values_mixed(1000, 1001)
			
			# invalid: named values cannot appear
			# before positional values
			# add_values_mixed(named=1001, 1000)
			
Implementation

The bytecode for function definitions is identical regardless of the presence of keyword-only arguments. The restriction is enforced within the CALL instruction. In particular, the helper function initialize_locals checks that the number of positional arguments is not more than expected (PySource) Python/ceval.c lines 1458-1462
Trace from CALL implementation to initialize_locals definition:
Call to `_PyEvalFramePushAndInit` Python/bytecodes.c lines 2689-2691
Call to `initialize_locals` Python/ceval.c line 1594
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
by checking the co_argcount member which tracks the number of arguments which may be positional (PySource) Doc/library/inspect.rst lines 146-149
Python/assemble.c line 556
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
.
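The counts that the interpreter consults are visible from Python through the function's code object. The sketch below reads `co_argcount` and `co_kwonlyargcount` for the example function and then triggers the positional-count check.

```python
def add_values_mixed(either, *, named):
    return either + named


code = add_values_mixed.__code__
# Number of parameters that may receive a positional value.
print(code.co_argcount)
# Number of keyword-only parameters.
print(code.co_kwonlyargcount)

try:
    # Two positional values exceed co_argcount, so the call is rejected.
    add_values_mixed(1000, 1001)
except TypeError as error:
    print("TypeError:", error)
```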

Bytecode for calls which use named values differs significantly from the simple call. For example, consider this code:

			def add_values(a, b):
			    return a + b
			
			add_values(b=1001, a=1000)
			add_values(1000, b=1001)
			

The bytecode for the first call differs from the simple call by adding the KW_NAMES instruction prior to the CALL instruction:

			LOAD_NAME                1 (add_values)
			LOAD_CONST               6 (1001)
			LOAD_CONST               5 (1000)
			KW_NAMES                 8 (('b', 'a'))
			CALL                     2
			
KW_NAMES marks the given constant, in this case the tuple ('b', 'a'), as a set of argument names to be used by CALL (PySource) Python/bytecodes.c lines 2601-2605, 2644, 2869-2692, 2706-2709
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
Then, the CALL instruction determines which values belong to which arguments by matching their respective positions on the stack and in the tuple (PySource) Python/ceval.c lines 1383-1384
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
.

The process is similar when some values are provided by position and others by name. The second call above does not have any additional instructions to handle this case:

			LOAD_NAME                2 (add_values_mixed)
			LOAD_CONST               5 (1000)
			LOAD_CONST               6 (1001)
			KW_NAMES                12 (('b',))
			CALL                     2
			

The CALL instruction infers which value is named based on the restriction that positional values must precede named values (PySource) Python/ceval.c lines 1383-1384
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
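The matching described above can be modelled in a few lines of Python. This is a simplified sketch, not CPython's algorithm: the function `bind_arguments` and its parameters are hypothetical, and defaults, keyword-only arguments and variadic variables are ignored. It assumes only what the text states: the named values are the last values on the stack, in the same order as the names in the tuple.

```python
# Hypothetical model of how values on the stack are matched to arguments.
def bind_arguments(param_names, stack_values, kw_names=()):
    # Positional values come first; the last len(kw_names) values are the
    # named ones, in the same order as kw_names.
    positional_count = len(stack_values) - len(kw_names)
    if positional_count > len(param_names):
        raise TypeError("too many positional values")
    bound = dict(zip(param_names, stack_values[:positional_count]))
    for name, value in zip(kw_names, stack_values[positional_count:]):
        if name not in param_names:
            raise TypeError(f"unexpected name: {name}")
        if name in bound:
            raise TypeError(f"multiple values for: {name}")
        bound[name] = value
    return bound


# Models add_values(b=1001, a=1000)
print(bind_arguments(("a", "b"), (1001, 1000), ("b", "a")))
# Models add_values(1000, b=1001)
print(bind_arguments(("a", "b"), (1000, 1001), ("b",)))
```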

Historical Record

Named values were first introduced in Python 1.3 (Py1.3Source) Doc/tut.tex lines 3540-3626
Python Source. 1.3. SHA256: 892a12d7360b5e6aed0f6a637f5dbdd4ea9a27ee0e2741337393b1d5711d2d33 Python Software Foundation. https://legacy.python.org/download/releases/src/python-1.3.tar.gz
The feature was based on the similar feature provided by Modula-3 (Py1.3Source) Doc/tut.tex lines 3584-3586
Python Source. 1.3. SHA256: 892a12d7360b5e6aed0f6a637f5dbdd4ea9a27ee0e2741337393b1d5711d2d33 Python Software Foundation. https://legacy.python.org/download/releases/src/python-1.3.tar.gz
. While keyword-only arguments (discussed later in this section) were added afterwards, the core syntax and semantics of named values have remained unchanged.

PEP 3102 introduced keyword-only arguments. It provides a single justification for the change: variadic functions cannot make use of default values. The PEP gives the following example:

			def sortwords(*words, case_sense=False):
			    pass
			

If the value associated with case_sense can be provided positionally then it must be provided in every call even if the caller wants the default value of False* Unless the caller wants to sort the empty list. =) .
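To make the PEP's example concrete, the sketch below fills in a function body. The body is hypothetical, since the PEP only gives the signature. Because case_sense follows the variadic variable, it can only be given by name, so callers who want the default can omit it.

```python
def sortwords(*words, case_sense=False):
    # Hypothetical body: compare lowercased words unless the caller asks
    # for a case-sensitive sort.
    key = None if case_sense else str.lower
    return sorted(words, key=key)


# Every positional value is a word; the default for case_sense applies.
print(sortwords("b", "a", "C"))
# The option is only reachable by name.
print(sortwords("b", "a", "C", case_sense=True))
```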

Repeating Value

Python produces an error when a value is repeated.
Syntax & Semantics
A repeated value is provided by naming 2 different values with the same name:
			add_values(a=1000, b=1001, a=1002)
			
This produces an error:
			SyntaxError: keyword argument repeated: a
			
Implementation
Python checks for repeating arguments statically at compile-time. The compiler_call helper function is used to emit instructions related to a function call (PySource) Python/compile.c line 6126 Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython . The first thing this function does is call the validate_keywords function (PySource) Python/compile.c line 4914 Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython . This function manually compares each name to each subsequent name and returns an error if a duplicate is found (PySource) Python/compile.c lines 4900-4905 Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython .
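One way to observe that the check is static is to compile the call without running it. In the sketch below, add_values is not defined at the time of compilation, yet the SyntaxError is still raised. The second half shows the contrasting case in which the duplicate name only becomes visible at run time, through an unpacked mapping, and is reported as a TypeError instead.

```python
source = "add_values(a=1000, b=1001, a=1002)"

try:
    # Nothing is executed here: compiling alone triggers the error.
    compile(source, "<example>", "exec")
    outcome = "compiled"
except SyntaxError as error:
    outcome = f"SyntaxError: {error.msg}"
print(outcome)


def add_values(a, b):
    return a + b


try:
    # The duplicate name is hidden inside a mapping, so the compiler
    # cannot see it and the error surfaces when the call runs.
    add_values(a=1000, b=1001, **{"a": 1002})
    runtime_outcome = "called"
except TypeError:
    runtime_outcome = "TypeError"
print(runtime_outcome)
```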
Historical Record
The explicit check for duplicate value names was added in version 1.5a3 (PySource) Misc/HISTORY lines 30752-30753 Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython ; it is not clear what behavior Python exhibited before this change * Unfortunately, Python's 1.4 source does not compile on modern compilers with many errors such as "GCC no longer supports <varargs.h>", so I was not able to test the behavior at this version. .

Implicit Value

This feature is not provided by Python.

Typed Value

Provided through type hinting. The Python compiler and interpreter do not change their behavior based on type hints. However, they do guarantee that the hints will be available to external tools and provide supporting infrastructure to help the tools work correctly. Both static analyzers and runtime checkers can make use of annotations.

Syntax & Semantics

Type hints are specified using function annotations, as defined in PEP 3107. This means that function authors add a colon and type name after the variable name associated with the argument:

			def typed_argument(x: str):
			  pass
			
Implementation

Annotations are stored as metadata in the Python object which represents the function. Libraries can access them through the __annotations__ property, which contains a dictionary (PEPs) peps/pep-3107.rst "Accessing Function Annotations"
Python Enhancement Proposals (PEPs). Commit: fca6000dfd8b55127c9a1fcbac809ddb27aafab7. Python Software foundation. https://github.com/python/peps
.
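A short sketch of what a library sees when it inspects the example function, assuming annotations are evaluated normally (that is, without `from __future__ import annotations`, which would store them as strings):

```python
def typed_argument(x: str):
    return x


# The dictionary maps argument names to the annotated objects.
print(typed_argument.__annotations__)

# The interpreter does not enforce the hint: passing an int succeeds.
print(typed_argument(42))
```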

Historical Record

The foundations for type hinting were added in PEP 3107, which defines the syntax for function annotations (PEPs) peps/pep-3107.rst
Python Enhancement Proposals (PEPs). Commit: fca6000dfd8b55127c9a1fcbac809ddb27aafab7. Python Software foundation. https://github.com/python/peps
. The Python developers then waited for external community-driven tools to experiment with different type-checking approaches. Eventually, they took lessons learned from the community and created a set of recommendations in PEPs 482, 483, and 484 (PEPs) peps/pep-0484.rst
Python Enhancement Proposals (PEPs). Commit: fca6000dfd8b55127c9a1fcbac809ddb27aafab7. Python Software foundation. https://github.com/python/peps
. Much of their content addresses type theory issues, such as generics, variance, and special types like Any. Since this initial introduction there have been a number of PEPs which further clarify best practices or provide syntactic improvements to type specifications. However, the core mechanism that this paper is concerned with - associating a type with an argument, regardless of how that type is specified - remains unchanged.

Caller Destructuring

Provided through argument unpacking. Caller destructuring is only available for iterables and mappings.

Syntax & Semantics

This feature allows a caller to prefix one or more iterables with * in order to translate their contents into a set of positional values, and/or prefix one or more mappings with ** to translate their contents into a set of named values. For mappings, keys must be strings naming an argument. (PyDoc) reference/expressions.html section 6.3.4 "Calls"
Python Documentation. 3.12.2. SHA-256: d600427f22db970ddf4f5699fdb5a8ebda8e9e92ddb09af807bbeabf07e64df6. Python Software Foundation. https://docs.python.org/3.12/archives/python-3.12.2-docs-html.zip

			def add_values(a, b, c):
			    return a + b + c
			
			# all of the below calls are equivalent
			# to this:
			# add_values(1000, 1001, 1002)
			
			# destructure an iterable into positional
			# values
			l = [ 1000, 1001, 1002 ]
			add_values(*l)
			
			# destructure multiple iterables into
			# positional values
			first_part = [ 1000 ]
			second_part = [ 1001, 1002 ]
			add_values(*first_part, *second_part)
			
			# destructure a mapping into named
			# values
			d = { 'a': 1000, 'b': 1001, 'c': 1002 }
			add_values(**d)
			
			# destructure multiple mappings into
			# named values
			first_part = { 'b': 1001 }
			second_part = { 'a': 1000, 'c': 1002 }
			add_values(**first_part, **second_part)
			
Implementation

The difference between the simple call and a call which includes destructuring is best explained by starting with the final instruction. While the simple call uses the plain CALL instruction, a destructuring call uses the CALL_FUNCTION_EX instruction. CALL_FUNCTION_EX receives either 0 or 1, which tells it whether or not there is a mapping to destructure (PySource) Doc/library/dis.rst lines 1398-1410
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
.

When it receives 1, there is a mapping to destructure which will be on the top of the stack. While the caller can use any mapping (and any number of mappings), CALL_FUNCTION_EX will always see a single dictionary when it executes (the process which ensures this is discussed in more detail later in this section). The dictionary is turned into a set of keyword arguments by interpreting the keys as names identifying arguments. (PySource) Objects/call.c lines 1029-1053
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython

The next item on the stack is an iterable to destructure. In this case, CALL_FUNCTION_EX might see any iterable on the stack. If the iterable is not a tuple it will convert it into a tuple (PySource) Python/bytecodes.c lines 3198-3207
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
. The elements of this tuple will be used as positional values. (PySource) Python/bytecodes.c line 3219
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
.

When CALL_FUNCTION_EX receives 0 the process is similar, except that the top element of the stack is an iterable and there is no mapping.
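The normalization described above is observable from Python. In the sketch below the caller passes a generator and a read-only mapping, neither of which is a tuple or a dictionary, and the hypothetical function `describe` reports the types it receives.

```python
from types import MappingProxyType


# Hypothetical function that reports how its arguments arrived.
def describe(*args, **kwargs):
    return type(args).__name__, type(kwargs).__name__, args, kwargs


values = (n for n in (1000, 1001))     # a generator, not a tuple
names = MappingProxyType({"c": 1002})  # a read-only mapping, not a dict

# The callee sees a tuple and a dictionary regardless of what was unpacked.
print(describe(*values, **names))
```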

The compiler ensures that CALL_FUNCTION_EX only receives dictionaries (rather than the arbitrary mapping object provided by the caller) with two instructions. First, it issues a BUILD_MAP instruction to place a new dictionary on the stack (PySource) Doc/library/dis.rst lines 1015-1023
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
Then it adds the keys and values of each mapping object into this dictionary by repeatedly calling the DICT_MERGE instruction. For example, this code:

			add_values(**d)
			

Compiles to this bytecode (note that BUILD_MAP receives the value 0 to indicate that it is building an empty dictionary):

			LOAD_NAME         0 (add_values)
			LOAD_CONST       13 (())
			BUILD_MAP         0
			LOAD_NAME         3 (d)
			DICT_MERGE        1
			CALL_FUNCTION_EX  1
			

When named values are provided separately from the destructured values, the freshly created dictionary is prepopulated with those values. For example, this code:

			d = { 'b': 1001, 'c': 1002 }
			add_values(a=1000, **d)
			

Compiles to this bytecode (note that in this case, BUILD_MAP receives the value 1 to indicate that there is one key-value pair on the stack):

			LOAD_NAME         1 (add_values)
			LOAD_CONST       17 (())
			LOAD_CONST       11 ('a')
			LOAD_CONST        3 (1000)
			BUILD_MAP         1
			LOAD_NAME         4 (d)
			DICT_MERGE        1
			CALL_FUNCTION_EX  1
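The dictionary-building step can be modelled in pure Python as follows. This is a sketch under stated assumptions: the function `merge_keyword_mappings` is hypothetical and stands in for the BUILD_MAP instruction followed by one DICT_MERGE per unpacked mapping. It also models the fact that a duplicate key is an error rather than being overwritten.

```python
# Hypothetical model of BUILD_MAP followed by repeated DICT_MERGE.
def merge_keyword_mappings(explicit_pairs, *mappings):
    # Models BUILD_MAP: start from the explicitly named values, if any.
    merged = dict(explicit_pairs)
    # Models one DICT_MERGE per unpacked mapping.
    for mapping in mappings:
        for key in mapping.keys():
            if key in merged:
                raise TypeError(f"multiple values for keyword argument {key!r}")
            merged[key] = mapping[key]
    return merged


# Models add_values(a=1000, **d)
print(merge_keyword_mappings({"a": 1000}, {"b": 1001, "c": 1002}))
```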
			

When the caller provides multiple destructured iterables, or provides literal positional values in addition to one or more destructured iterables, the compiler issues instructions to merge them into a list, then converts that list into a tuple. For example, this code:

			t0 = ( 1001, )
			t1 = ( 1002, )
			add_values(1000, *t0, *t1)
			

Compiles to this bytecode:

			LOAD_NAME        1 (add_values)
			LOAD_CONST       3 (1000)
			BUILD_LIST       1
			LOAD_NAME        4 (t0)
			LIST_EXTEND      1
			LOAD_NAME        5 (t1)
			LIST_EXTEND      1
			CALL_INTRINSIC_1 6 (INTRINSIC_LIST_TO_TUPLE)
			CALL_FUNCTION_EX 0
			

If the caller provides only a single iterable to destructure, and no literal positional values, this iterable is placed onto the stack without modification and the tuple creation logic contained within CALL_FUNCTION_EX itself is triggered (PySource) Python/bytecodes.c line 3202 Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython .

Historical Record

When argument unpacking was first introduced in version 1.6 (PySource) Misc/HISTORY lines 26740-26743
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
it only allowed callers to unpack a single iterable and/or a single mapping. For example, the call add_values(*first_part, *second_part) would have been illegal. PEP 448 expanded argument unpacking so that multiple values can be destructured in the same call (PEPs) peps/pep-0448.rst
Python Enhancement Proposals (PEPs). Commit: fca6000dfd8b55127c9a1fcbac809ddb27aafab7. Python Software foundation. https://github.com/python/peps
. The rationale given for this change was enhanced readability, as previously callers would either need to build iterables/dictionaries separately or destructure them manually, adding additional lines of code which are semantically sparse.
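The readability argument can be illustrated by writing the same call both ways. The first form builds the combined iterable separately, which is one of the workarounds mentioned above; the second uses the syntax that PEP 448 made legal.

```python
def add_values(a, b, c):
    return a + b + c


first_part = [1000]
second_part = [1001, 1002]

# Single-unpacking style: build the combined list first, then unpack it.
combined = first_part + second_part
before = add_values(*combined)

# PEP 448 style: unpack both iterables directly in the call.
after = add_values(*first_part, *second_part)

print(before == after)
```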

Author Destructuring

While Python does not currently support authorial destructuring, it did so prior to version 3.0 (PEPs) peps/pep-3113.rst
Python Enhancement Proposals (PEPs). Commit: fca6000dfd8b55127c9a1fcbac809ddb27aafab7. Python Software foundation. https://github.com/python/peps
. It allowed authors to declare that arguments should receive tuples whose values would be bound to separate local variables:

			def distance((x1, y1), (x2, y2)):
			  pass
			

This function would require that callers pass in 2 values which are both tuples containing 2 elements. The values from the first tuple would be bound to local variables x1 and y1, while the values from the second would be bound to x2 and y2.

The functionality was removed through PEP 3113. The rationale includes multiple implementation issues which are important to the Python community but not relevant to this paper.
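In current versions of Python, an author can get a similar effect by unpacking the arguments in the first statement of the function body. The sketch below shows one way to do this; the parameter names `p1` and `p2` and the function body are hypothetical.

```python
import math


def distance(p1, p2):
    # Manual destructuring: each argument must be a 2-element iterable.
    (x1, y1), (x2, y2) = p1, p2
    return math.hypot(x2 - x1, y2 - y1)


print(distance((0, 0), (3, 4)))
```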

Variadic Function

Provided by arbitrary argument lists and dictionaries. Positional values are collected by the former while named values are collected by the latter.

Syntax & Semantics

Function authors make a function variadic by naming one or two variadic variables. The name for the variadic list must be prefixed by a * while the name for the variadic dictionary must be prefixed by a ** (PyDoc) reference/compound_stmts.html section 8.7 "Function Definitions"
Python Documentation. 3.12.2. SHA-256: d600427f22db970ddf4f5699fdb5a8ebda8e9e92ddb09af807bbeabf07e64df6. Python Software Foundation. https://docs.python.org/3.12/archives/python-3.12.2-docs-html.zip
.

			from itertools import chain
			
			def add_values(*pos_vals, **named_vals):
				return sum(chain(pos_vals, named_vals.values()))
			
			# All of these values appear in the
			# pos_vals list
			add_values(1000, 1001, 1002, 1003)
			
			# All of these values appear in the
			# named_vals dictionary
			add_values(named_val0=1000,
			           named_val1=1001,
			           named_val2=1002,
			           named_val3=1003)
			
			# The values 1002 and 1003 appear in the
			# pos_vals list while the names and
			# values named_val0=1000 and
			# named_val1=1001 appear in the
			# named_vals dictionary
			add_values(1002,
			           1003,
			           named_val0=1000,
			           named_val1=1001)
			
Implementation

The bytecode for variadic calls does not differ from the simple call (or a call with named values, when relevant). Similarly, the bytecode for the definition of variadic functions does not differ except implicitly in the way that the variables are used (e.g., as lists or dictionaries rather than plain values). The interpreter tracks which positional values are also variadic values by checking the co_argcount variable associated with the function's code object. Remaining positional arguments are moved into the appropriate variadic variable, if it exists (PySource) Python/ceval.c lines 1355-1376
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
It distinguishes variadic named values from non-variadic named values by checking if the name is expected; the interpreter already has to keep track of this information because an unrecognized value name is considered an error for non-variadic functions (PySource) Python/ceval.c lines 1378-1455
Python Source. 3.12.2. Commit: 6abddd9f6afdddc09031989e0deb25e301ecf315. Python Software Foundation. https://github.com/python/cpython
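The bookkeeping described above can be inspected from Python. The sketch below defines a hypothetical variant of the example function with one ordinary parameter, then reads `co_argcount` and the `CO_VARARGS` and `CO_VARKEYWORDS` flags from its code object.

```python
import inspect


# Hypothetical function with one ordinary parameter and both variadic
# variables.
def add_values(first, *pos_vals, **named_vals):
    return first, pos_vals, named_vals


code = add_values.__code__
# Only `first` counts towards co_argcount; later positional values are
# the "remaining" ones collected by *pos_vals.
print(code.co_argcount)
print(bool(code.co_flags & inspect.CO_VARARGS))
print(bool(code.co_flags & inspect.CO_VARKEYWORDS))

# 1 binds to `first`; 2 and 3 are collected positionally; the
# unrecognized name `extra` is collected by name.
print(add_values(1, 2, 3, extra=4))
```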

Historical Record

PEP 468 updated the variadic variable for named values such that the author can retrieve the syntactic order in which the values were given. The rationale given for this change is that some users are developing APIs where order matters, such as serialization (PEPs) peps/pep-0468.rst
Python Enhancement Proposals (PEPs). Commit: fca6000dfd8b55127c9a1fcbac809ddb27aafab7. Python Software foundation. https://github.com/python/peps
.
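A small sketch of the guarantee: the hypothetical function `field_order` returns the names it received, and they come back in the order the caller wrote them rather than, for example, in alphabetical order.

```python
# Hypothetical function that reports the order of its named values.
def field_order(**fields):
    return list(fields)


print(field_order(zeta=1, alpha=2, mid=3))
```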

Guile

Guile provides the following features:

All code samples were compiled with the partial-evaluation optimization turned off. This is because partial evaluation frequently removes the function call itself as the return values can be calculated at compile-time.

Additionally, bytecode listings include comments which help explain what the instruction is doing. The meaning of the comment depends on the instruction. For example, the make-immediate bytecode instruction copies a literal static value onto the stack. It contains a comment which indicates the value loaded:

			(make-immediate 4 4002)         ;; 1000
			

The call-scm<-scm-scm instruction, which is pronounced "Call scheme from scheme scheme" to signify that it is calling a built-in function that returns a scheme value and accepts two scheme values as input, instead includes the name of the built-in function it is calling:

			(call-scm<-scm-scm 8 8 7 111)   ;; lookup-bound
			

Note that the alloc-frame instruction will contain a comment referring to the number of "slots" that the frame has; this refers to the size of the stack after the instruction finishes executing.

These comments are helpfully added by the Guile decompiler.

Background

Guile implements the Scheme programming language, which is a dialect of Lisp. Scheme was originally described in a 1975 paper for demonstrative purposes (r0rs) Page 1.
Scheme: an Interpreter for Extended Lambda Calculus. Massachusetts Institute of Technology - Artificial Intelligence Laboratory. https://standards.scheme.org/official/r0rs.pdf
. Interest in the language led to a series of revisions to the original description and, eventually, standardization. These papers are referred to as "Revised Reports on Scheme", abbreviated to "rNrs" where the N is replaced with a number representing the revision count. For example, the most recently published version of the standard is the 7th revision of the Scheme programming language so it is referred to as "r7rs".

Guile was originally implemented as an interpreter which worked with a literal representation of a program's text (guile-manual) Section 9.3.1 "Why a VM?"
Guile Manual. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git Shell command: info guile
A virtual machine was added to Guile in the 2.0 release (2010) (guile-manual) Section 9.1.4 "A Timeline of Selected Guile Releases"
Guile Manual. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git Shell command: info guile
(guile-source) NEWS lines 5068-5071
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
and was rewritten for the 2.2 release (2017) (guile-manual) Section 9.1.4 "A Timeline of Selected Guile Releases"
Guile Manual. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git Shell command: info guile
Modern Guile implements a compiler which produces bytecode, an interpreter which executes bytecode, and an interpreter which directly executes program text (guile-source) modules/language/scheme/spec.scm lines 43-45
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
This paper will focus on compiled bytecode as this keeps the analysis consistent with other sections and is the typical way to execute Guile code* For example, running guile script-name will first compile the script then run the compiled file rather than running the script directly. .

Guile bytecode operates as a stack machine with 2 pointers into the stack: the frame pointer and the stack pointer. (guile-manual) Section 9.3.2 "VM Concepts"
Guile Manual. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git Shell command: info guile
. The frame pointer stores a location near* It is "near" the beginning rather than "at" the beginning because some metadata is stored before the frame pointer's location; this metadata will not be relevant to this paper. the beginning of the frame, where each frame represents a single function call. The stack pointer keeps track of the end of the stack, like a pointer to the end of an array (guile-manual) Section 9.3.3 "Stack Layout"
Guile Manual. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git Shell command: info guile
. When instructions take an index, they differ in whether they take in an index relative to the stack pointer or the frame pointer (guile-manual) Section 9.3.5 "Compiled Procedures are VM Programs"
Guile Manual. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git Shell command: info guile
. In prose, this paper will always reference indexes relative to the stack pointer for the sake of consistency; this has the effect that frames start at a higher index and end at a lower index. For example, in the simple call the frame pointer, associated with the beginning of the stack, is moved to index 5 while the stack pointer, associated with the end of the stack, is moved to index 2.

Guile produces different kinds of call instructions based on the call's location. Code samples in the repository were intentionally crafted to ensure that they always produced the plain call instruction; irrelevant parts of the code (such as a constant value placed after the call) are omitted from this paper.

The optargs module

Many of the features described in this paper are provided through Guile's optargs module, which was introduced in version 1.3.2 (1999) (guile-manual) Section 9.1.4 "A Timeline of Selected Guile Releases"
Guile Manual. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git Shell command: info guile
(guile-source) NEWS lines 11451-11525
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
At this time, Guile lacked a virtual machine and the module was implemented in pure Scheme. This implementation operated by adding a prelude to the function definition which searches through the values provided by the caller to decide which values should be assigned to which variables (guile-1.3.2-source) ice-9/optargs.scm
Guile Source. 1.3.2. Commit: 0a852b9424f949575afecb19a391023acc63e635. The GNU Project. https://git.savannah.gnu.org/git/guile.git
. The virtual machine was added to Guile in version 2.0.0 (2010) (guile-manual) Section 9.1.4 "A Timeline of Selected Guile Releases"
Guile Manual. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git Shell command: info guile
(guile-source) NEWS lines 5068-5071
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
. This also entailed a rewrite of the optargs module which centered the implementation around the internal <lambda-case> structure. This structure contains all the information about the different kinds of arguments accepted by the function and facilitates the use of special-purpose VM instructions for processing arguments efficiently (guile-devel-archive) 2009-10 lines 6428-6453
Index of /archive/mbox/guile-devel. The GNU Project. https://lists.gnu.org/archive/mbox/guile-devel/
. While the VM has gone through changes since then, including a complete rewrite in version 2.2 (guile-manual) Section 9.1.4 "A Timeline of Selected Guile Releases"
Guile Manual. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git Shell command: info guile
, the implementation of the optargs module has remained stable. The special-purpose VM instructions have been adapted to handle the details of VM operation correctly and decrease latency, but the core logic used to process arguments has proven to be robust.

Simple Function and Call

			(define (add-values a b)
			  (+ a b))
			
			(add-values 1000 1001)
			

The bytecode generated for the function definition is fairly straightforward.

			0 (call-scm<-scm-scm 1 1 0 0)     ;; add
			1 (reset-frame 1)                 ;; 1 slot
			2 (handle-interrupts)
			3 (return-values)
			

Instruction 0 calls the built-in function add (which corresponds to the + function) and places the result into index 0 (guile-source) libguile/vm-engine.c lines 1545-1549
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git

Instruction 1 resizes the stack so that it contains 1 element - in this case, the return value (guile-source) libguile/vm-engine.c lines 797-802
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
.

Instruction 2 ensures that Guile properly handles signals like Ctrl-C (SIGINT) and code instrumentation (guile-manual) Section 6.22.3 "Asynchronous Interrupts"
Section 9.3.7.6 "Instrumentation Instructions"
Guile Manual. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git Shell command: info guile
; it can be safely ignored here and will not be mentioned again.

Finally, the return-values instruction moves the flow of execution back to the caller. Counterintuitively, it does not actually handle moving the return values to specific locations; it only sets the frame and instruction pointers to the caller's values (guile-source) libguile/vm-engine.c lines 530-555
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
.

The bytecode generated for the call to add-values is more complicated because it is calling a user-defined function. It needs to load the function and its arguments* The function definition did not have to manage the arguments for the built-in call because built-in calls take values from arbitrary stack locations which are specified in the instruction arguments, so the caller-supplied locations were reusable. , then dispatch to the function.

			0 (static-ref 7 16324)            ;; add-values
			1 (call-scm<-scm-scm 8 8 7 111)   ;; lookup-bound
			2 (scm-ref/immediate 5 8 1)
			3 (make-immediate 4 4002)         ;; 1000
			4 (make-immediate 3 4006)         ;; 1001
			5 (handle-interrupts)
			6 (call 3 3)
			

Instruction 0 loads the literal string "add-values" into index 7 (guile-source) libguile/vm-engine.c lines 2125-2129
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
.

Instruction 1 looks up the value associated with the name add-values and stores it in index 8 (guile-source) libguile/vm-engine.c lines 1545-1549
libguile/intrinsics.c lines 374-37
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
. The value includes both the procedure itself and some associated metadata; instruction 2 retrieves the procedure itself and stores it in index 5 (guile-source) libguile/vm-engine.c lines 1906-1910
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git

Instructions 3-4 place the constant values 1000 and 1001 into indices 4 and 3, respectively. At this point the most relevant portion of the stack looks like this:

Index Value
5 #<procedure add-values>
4 1000
3 1001

Finally, instruction 6 calls the function; internally, this involves moving the frame pointer to index 5 (where the procedure is stored) and the stack pointer to index 2 (the new end of the stack, which is smaller for the callee than for the caller) (guile-source) libguile/vm-engine.c lines 451-454
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
.
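Readers who want to produce listings like the ones above can approximate them from a Guile session. The following is a minimal sketch, assuming the disassemble-program procedure exported by the (system vm disassembler) module in recent Guile releases. The exact instructions, offsets, and operand values depend on the Guile version and optimization settings, so the output should not be expected to match this paper exactly.

```scheme
(use-modules (system vm disassembler))

(define (add-values a b)
  (+ a b))

;; Print the bytecode for the procedure to the current output port.
;; The listing varies between Guile versions and optimization levels.
(disassemble-program add-values)
```

At the REPL, the ,disassemble meta-command (abbreviated ,x) offers the same view for an existing procedure.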

Positional Value

Provided as the default mechanism for mapping values to arguments.

Syntax & Semantics

The syntax and semantics for both calls and definitions are identical to the simple call.

Implementation

Guile does not explicitly enforce a restriction that values can only be supplied positionally. This is because, technically, all values are always supplied positionally. When a caller provides named values they are actually providing pairs of values: keywords as names and arbitrary values as values. For example, given this call it is impossible to say whether the caller is supplying named values or positional values:

			(make-dictionary #:first  "one"
			                 #:second "two"
			                 #:third  "three")
			

This could be a valid call to a function with named values:

			(define* (make-dictionary #:key first second third)
			  (let ((result (make-hash-table)))
			    (hash-set! result #:first  first)
			    (hash-set! result #:second second)
			    (hash-set! result #:third  third)
			    result))
			

Or it could be a valid call to a function which takes 6 positional values:

			(define* (make-dictionary key0 val0
			                          key1 val1
			                          key2 val2)
			  (let ((result (make-hash-table)))
			    (hash-set! result key0 val0)
			    (hash-set! result key1 val1)
			    (hash-set! result key2 val2)
			    result))
			

From the perspective of this particular caller, it is impossible to determine which implementation is in use without inspecting the code* Of course, a caller who wanted to use different keywords in their dictionary would notice the difference between these implementations very quickly! .

However, it is likely that using named values for a function which does not accept them will result in an error. This is because names are themselves values and are unlikely to be the correct kind of value for the function. For example, this attempt to use named values resulted in an error because the local variable a is bound to the value #:a, which is not a valid input to +:

			(define (add-values a b)
			  (+ a b))
			
			(add-values #:a 5 #:b 7)
			; ice-9/boot-9.scm:1685:16: In procedure raise-exception:
			; In procedure +: Wrong type argument in position 1: #:a
			
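The claim that names are themselves values can be checked directly. The sketch below uses a hypothetical variadic helper, collect-arguments, which is not part of the paper's repository; it simply returns whatever the caller supplied.

```scheme
;; Hypothetical helper: return every supplied value as a list
;; (the dotted-tail syntax is standard Scheme).
(define (collect-arguments . args)
  args)

(define received (collect-arguments #:first "one" #:second "two"))

(length received)          ; 4 values, not 2 name/value pairs
(keyword? (car received))  ; #t: the "name" is itself a value
(string? (cadr received))  ; #t
```

The function receives four ordinary values; interpreting them as two named values is purely a convention of the callee.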
Historical Record

The Scheme standard requires that implementations accept values positionally. This has been true since the original paper describing Lisp (original-lisp-paper) Pages 185-186
McCarthy, John. Recursive functions of symbolic expressions and their computation by machine, Part I. Association for Computing Machinery, 1960. https://doi.org/10.1145/367177.367199
. The stated motivation is to distinguish between "functions" and "forms". In that paper, a function is the abstract idea of a formula that can have concrete numbers applied to it. Traditionally, a form is syntactically identical to a formula, but it is meant to be a stand-in for whatever value the formula will resolve to in context. The differentiating syntax is taken from a previous work which had a similar concern and does not explicitly justify the positional notation* The decision to supply arguments positionally appears to derive from prior work describing multi-argument functions as a series of functions which take in one of the values and call the "next" function which will receive the "next" value (this is similar to the concept of currying used in some programming languages). With this model, it is natural to write values positionally as they must be provided in the correct order to ensure that each partial function operates correctly. It's also worth noting that the examples of functions used in the paper are relatively simple - they are all inlinable with the prose. In the face of such simplicity, naming values would be verbose without much benefit. (calculi-of-lambda-conversion) Pages 3-7
Church, Alonzo. The Calculi of Lambda-Conversion. Princeton University Press, 1941. https://doi.org/10.2307/2267126

While the semantics of positional arguments have remained the same, the implementation has changed significantly. In the oldest version of Guile available through source control, values are given to a function by collecting them into a list, then dispatching based on the number of values expected by the function (guile-first-commit) libguile/eval.c lines 1822-2004
Guile Source. Commit: 0f2d19dd46f83f41177f61d585732b32a866d613. The GNU Project. https://git.savannah.gnu.org/git/guile.git
. With the addition of the VM, values are given by putting them into a frame managed by the VM (i.e., not a C-level frame), avoiding the need to create a list unless the function is variadic.

Default Value

Provided through both #:optional and #:key. When considering default values, both of these are semantically equivalent. The difference between them is whether values are passed by position or by name. This section will exclusively use #:optional because none of the functions used in the examples are complicated enough to benefit from named values.

Syntax & Semantics

The default value for an argument can either be specified or unspecified. If it is unspecified it is #false. Arguments become associated with a default value by placing them after the #:optional keyword in a define* form. This version of add-values demonstrates use of default values without specification.

			(define* (add-values #:optional a b)
			  (+ (or a 0) (or b 0)))
			

This function adds the given values, and the body of the function uses the value 0 if the caller omitted a value.

However, it is idiomatic to specify a default value in the function signature rather than testing the truthiness of the value. This is done by specifying the argument with a list instead of a symbol. The first element is the symbol naming the local variable and the second element is the default value.

			(define* (add-values #:optional (a 0) (b 0))
			  (+ a b))
			

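This version behaves identically to the first whenever the caller passes numbers or omits values; the two differ only if a caller explicitly passes #false, which the first version treats as 0.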
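As a brief usage sketch, the calls below exercise the signature-level defaults. The second definition, make-range, is a hypothetical function added only for illustration; it follows the Guile manual's description that a default is an expression which may refer to arguments appearing earlier in the signature.

```scheme
(define* (add-values #:optional (a 0) (b 0))
  (+ a b))

(add-values)        ; 0
(add-values 5)      ; 5  (b takes its default)
(add-values 5 7)    ; 12

;; Hypothetical example: the default for `end` refers to `start`.
(define* (make-range start #:optional (end (+ 10 start)))
  (list start end))

(make-range 1)      ; (1 11)
(make-range 1 5)    ; (1 5)
```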

Implementation

The bytecode for the function definition differs significantly. I will start with the bytecode for the function which includes default values in the signature as this version is simpler:

			   0    (bind-optionals 3)              ;; 2 args
			   1    (alloc-frame 3)                 ;; 3 slots
			   2    (immediate-tag=? 1 4095 2308)   ;; undefined?
			   3    (jne 2)                         ;; -> L1
			   4    (make-immediate 1 2)            ;; 0
			L1:
			   5    (immediate-tag=? 0 4095 2308)   ;; undefined?
			   6    (jne 2)                         ;; -> L2
			   7    (make-immediate 0 2)            ;; 0
			L2:
			   8    (call-scm<-scm-scm 2 1 0 0)     ;; add
			   9    (reset-frame 1)                 ;; 1 slot
			  10    (handle-interrupts)
			  11    (return-values)
			

Instruction 0, bind-optionals, checks if the caller omitted any argument values. If so, it fills in the associated variables with the undefined value (guile-source) libguile/vm-engine.c lines 3213-3233
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
.

The next 2 sections, instructions 2-4 and 5-7, check whether or not the local variables associated with optional values are undefined; if so, they are filled with the default value specified by the author.

Finally, instructions 8-11 are the body of the function as found in the simple call.
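The three phases of this bytecode can be modeled at the Scheme level. The sketch below is only an illustration of the logic, not how Guile implements it: the undefined sentinel and the add-values-model procedure are names I made up for this purpose, and the real VM works on stack slots rather than lists.

```scheme
;; A unique object standing in for the VM's internal "undefined" value.
(define undefined (list 'undefined))

(define (add-values-model . supplied)
  ;; Phase 1 (cf. bind-optionals): omitted positions become `undefined`.
  (let* ((a (if (pair? supplied) (car supplied) undefined))
         (b (if (and (pair? supplied) (pair? (cdr supplied)))
                (cadr supplied)
                undefined))
         ;; Phase 2 (cf. instructions 2-7): replace `undefined` with the
         ;; default value written by the author.
         (a (if (eq? a undefined) 0 a))
         (b (if (eq? b undefined) 0 b)))
    ;; Phase 3 (cf. instructions 8-11): the original function body.
    (+ a b)))

(add-values-model)      ; 0
(add-values-model 5)    ; 5
(add-values-model 5 7)  ; 12
```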

The version which does not specify default values in the signature is similar, but contains additional (and repetitious) logic.

			 0 (bind-optionals 3)              ;; 2 args
			 1 (alloc-frame 3)                 ;; 3 slots
			 2 (immediate-tag=? 1 4095 2308)   ;; undefined?
			 3 (jne 2)                         ;; -> L1
			 4 (make-immediate 1 4)            ;; #f
			L1:
			 5 (immediate-tag=? 0 4095 2308)   ;; undefined?
			 6 (jne 2)                         ;; -> L2
			 7 (make-immediate 0 4)            ;; #f
			L2:
			 8 (immediate-tag=? 1 3839 4)      ;; false?
			 9 (jne 2)                         ;; -> L3
			10 (make-immediate 1 2)            ;; 0
			L3:
			11 (immediate-tag=? 0 3839 4)      ;; false?
			12 (jne 2)                         ;; -> L4
			13 (make-immediate 0 2)            ;; 0
			L4:
			14 (call-scm<-scm-scm 2 1 0 0)     ;; add
			15 (reset-frame 1)                 ;; 1 slot
			16 (handle-interrupts)
			17 (return-values)
			

Instruction 0 is also bind-optionals.

Instructions 2-7 fill in the default value, but in this case the value is #false rather than 0.

The next two sections (instructions 8-13) check the argument values a second time and replace each one with 0 if it is false; these represent the or expressions in the source code.

Instructions 14-17 are the body of the function as found in the simple call.

The bytecode for calling a function using default values does not differ from the bytecode for the simple call.

Historical Record

Default values were included in the original optargs implementation (guile-source) NEWS lines 11451-11525
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
. The semantics at that time were nearly identical to the current semantics. The one notable difference is that, in the case where the author does not define a default value and the caller does not supply one, the variable was not bound to any value. This was changed as part of merging a branch that was mostly concerned with how modules are handled. I was unable to find any record about the reason why this change was made* Marius Vollmer, the developer who authored the commit, was very kind in responding to an email enquiring about this. Unfortunately it has been more than 2 decades since the change was made and it was not notable enough to warrant a place in permanent memory. (guile-optargs-unbound-to-false) Diff of ice-9/optargs.scm.
Guile Source. Commit: 296ff5e78b8322fe4bf00c5ec1497dc28da776b8. The GNU Project. https://git.savannah.gnu.org/git/guile.git
(guile-devel-archive) 2001-05 lines 11305-11313
Index of /archive/mbox/guile-devel. The GNU Project. https://lists.gnu.org/archive/mbox/guile-devel/
. This was included in the 1.8 release series (2006)* I infer that it was first added in 1.8 because commit 296ff5e78b8322fe4bf00c5ec1497dc28da776b8, which implements the change, is between commit 0f24e75b73b9c0c745971de639a53748a395a1cb, which bumps the numbers in GUILE-VERSION from 1.7 to 1.9 (guile uses odd numbers for development versions and even numbers for release versions) and commit c299f186ff9693fc88859daef037e3d94cc7c0ff, which adds content to the NEWS file regarding changes new in 1.6. However, I did not find any content in the NEWS file which discusses the change (including in more recent release notes). (guile-manual) Section 9.1.4 "A Timeline of Selected Guile Releases"
Guile Manual. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git Shell command: info guile
.

Named Value

Provided through the #:key arguments to optargs definitions.

Syntax & Semantics

The caller must use the name specified by the author, which will also be the name used by the local variable* The caller uses a keyword to name the value while the author uses a symbol to name the variable, as normal, but they both use the same name. . When an author mandates a named value they also implicitly create a default value. Details about default values are described in the appropriate section; for the purpose of named values all we need to know is that if a caller omits a value then the local variable will be bound to #false. For example, an author of add-values could allow named values like this:

			(define* (add-values #:key a b)
			  (+ (or a 0) (or b 0)))
			

This function could be called in any of the following ways, with the result shown in the trailing comment:

			(add-values)             ; 0
			(add-values #:a 3)       ; 3
			(add-values #:b 3)       ; 3
			(add-values #:a 3 #:b 3) ; 6
			

However, these would not be legal:

			(add-values 3)
			(add-values 2 #:b 2)
			

When an author specifies that a value can be named this also means that it must be named, if it is provided at all.
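The difference between the legal and illegal calls can be observed at runtime. This is a small sketch, assuming only the standard catch procedure; raises-error? is a hypothetical helper written for this example, and the specific error keys and messages that Guile reports are deliberately not inspected.

```scheme
(define* (add-values #:key a b)
  (+ (or a 0) (or b 0)))

;; Hypothetical helper: run THUNK and report whether it raised an error.
(define (raises-error? thunk)
  (catch #t
    (lambda () (thunk) #f)
    (lambda (key . details) #t)))

(raises-error? (lambda () (add-values #:a 3)))    ; #f: named value
(raises-error? (lambda () (add-values 3)))        ; #t: unnamed value
(raises-error? (lambda () (add-values 2 #:b 2)))  ; #t: mixed values
```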

Additionally, an author may specify that a caller can supply arbitrary named values which are not specified in the function signature (and not bound to a local variable, unless the function is also variadic). This is done by adding #:allow-other-keys* Ironically, while the optargs module does not support implicit values for user-defined code, #:allow-other-keys is itself an implicit value. . to the function signature. For example, a definition of add-values with this feature would allow all of the following calls to be legal, and they would all produce the value 6.

			(define* (add-values #:key a b #:allow-other-keys)
			  (+ (or a 0) (or b 0)))
			
			(add-values #:a 3 #:b 3)
			(add-values #:a 3 #:b 3 #:c 34)
			(add-values #:c 34 #:a 3 #:b 3)
			(add-values #:a 3 #:b 3 #:multiplier 88)
			
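To see where the unrecognized names go, the function can also be made variadic. The sketch below follows the behavior described in the Guile manual, where the rest list contains every keyword and value the caller supplied, including the recognized ones; the variable name all-values is my own choice.

```scheme
(define* (add-values #:key a b #:allow-other-keys #:rest all-values)
  ;; Return both the sum and the raw list of supplied values.
  (list (+ (or a 0) (or b 0)) all-values))

(add-values #:a 3 #:b 3 #:multiplier 88)
;; (6 (#:a 3 #:b 3 #:multiplier 88))
```

Without #:rest, the extra name and its value are accepted but are not accessible from the function body.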
Implementation

When Guile compiles a procedure it stores the list of valid argument names next to the SCM object representing the code (guile-source) libguile/vm-engine.c lines 734, 736, 743-745; libguile/vm.c lines 1003, 1008-1009
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
When a function using named values is defined, the compiler adds additional instructions as a prelude in the definition.

			   0    (bind-kwargs 1 0 1 3 16351)     
			   1    (alloc-frame 3)                 ;; 3 slots
			   2    (immediate-tag=? 1 4095 2308)   ;; undefined?
			   3    (jne 2)                         ;; -> L1
			   4    (make-immediate 1 14)           ;; 3
			L1:
			   5    (immediate-tag=? 0 4095 2308)   ;; undefined?
			   6    (jne 2)                         ;; -> L2
			   7    (make-immediate 0 22)           ;; 5
			L2:
			   8    (call-scm<-scm-scm 2 1 0 0)     ;; add
			   9    (reset-frame 1)                 ;; 1 slot
			  10    (handle-interrupts)             
			  11    (return-values)                 
			

Instruction 0, bind-kwargs, performs the bulk of the work required to process named values at runtime. First, it initializes all relevant local variables (i.e., those that have default values or are named) to the undefined value (guile-source) libguile/vm.c lines 995-998
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
. Next, it walks through the list of non-positional arguments with the assumption that every even-indexed item is a keyword naming an argument and binds the odd-indexed items to the appropriate variables* Technically, it assumes that alternating values are keywords, except that it will silently ignore a non-keyword if #:rest is included in the procedure definition. However, I am currently focused on named values in isolation from other features. #:rest is discussed in the section on variadic functions. . (guile-source) libguile/vm.c lines 1000-1037
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
.
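The walk performed by bind-kwargs can be approximated in Scheme. This is a simplified model under my own assumptions, not a translation of the C code: bind-keywords is a hypothetical procedure, it returns an association list instead of writing to stack slots, it uses the symbol undefined as a stand-in for the VM's undefined value, and it ignores the interaction with #:rest.

```scheme
;; NAMES is the list of keywords the function accepts; ARGS is the list of
;; non-positional values supplied by the caller.  Returns an association
;; list from keyword to value, with `undefined` marking omitted values.
(define (bind-keywords names args allow-other-keys?)
  (let loop ((args args)
             (bindings (map (lambda (name) (cons name 'undefined)) names)))
    (cond
     ((null? args) bindings)
     ((not (keyword? (car args)))
      (error "expected a keyword but got" (car args)))
     ((null? (cdr args))
      (error "keyword has no accompanying value" (car args)))
     ((memq (car args) names)
      ;; Overwrite the binding for this keyword with the value after it.
      (loop (cddr args)
            (map (lambda (binding)
                   (if (eq? (car binding) (car args))
                       (cons (car binding) (cadr args))
                       binding))
                 bindings)))
     (allow-other-keys? (loop (cddr args) bindings))
     (else (error "unrecognized keyword" (car args))))))

(bind-keywords '(#:a #:b) '(#:b 9) #f)
;; ((#:a . undefined) (#:b . 9))
```

Any binding still marked undefined after the walk is what the later default-value instructions test for.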

Instruction 1 prepares the stack for the function call.

Instructions 2-7 ensure that the local variables which come from named values are initialized properly; these instructions are actually about default values, not named values, so they are discussed in the appropriate section.

Finally, instructions 8-11 are the body of the function as found in the simple definition.

The bytecode for the call is largely similar to the simple call, except that it also sends the keywords as additional argument values to the function (while they are not considered values within the body of a define* form, they are values at this lower level):

			 1 (call-scm<-thread 8 62)         ;; current-module
			 2 (static-ref 7 16324)            ;; add-values
			 3 (call-scm<-scm-scm 8 8 7 111)   ;; lookup-bound
			 4 (scm-ref/immediate 5 8 1)
			 5 (static-ref 4 16331)            ;; #:a
			 6 (make-immediate 3 34)           ;; 8
			 7 (static-ref 2 16340)            ;; #:b
			 8 (make-immediate 1 38)           ;; 9
			 9 (handle-interrupts)
			10 (call 3 5)
			
Historical Record

Named values were introduced with the original implementation of the optargs module (guile-source) NEWS lines 11451-11525
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
. In the original implementation, the let-ok-template helper function generated a scheme-level let (or let*) form which initially binds each variable name to either the default value supplied by the author or a fresh undefined variable (guile-initial-optargs) module/ice-9/optargs.scm lines 128-135
Guile Source. Commit: 7e01997e88c54216678271de36b1c2088377492d. The GNU Project. https://git.savannah.gnu.org/git/guile.git
The bindfilter local in the let-keyword-template helper replaced these initial values with the values supplied by the caller (guile-initial-optargs) module/ice-9/optargs.scm lines 152-168
Guile Source. Commit: 7e01997e88c54216678271de36b1c2088377492d. The GNU Project. https://git.savannah.gnu.org/git/guile.git
.

The current implementation is largely the same, except that it takes advantage of being able to directly manipulate the VM state from C code instead of using macros to mutate code syntactically. The maintainers decided to move the implementation from scheme to C in order to decrease the latency associated with a function call that uses named values (guile-devel-archive) 2009-10 lines 10053-10056, 14008-14009, 14139-14141
Index of /archive/mbox/guile-devel. The GNU Project. https://lists.gnu.org/archive/mbox/guile-devel/
.

One change, which may or may not be considered a bug fix, has to do with the way that Guile processes named values when the function also contains a variadic variable. Originally, Guile required that all named values precede all (other) variadic values. This was changed so that variadic values may be intermingled with named values, so long as variadic values are not keywords (unless #:allow-other-keys is set, in which case they may be keywords) (guile-source) Commit message, complete diff.
Guile Source. Commit: ff74e44ecba55f50b2c2c84bad2f13bed9489455. The GNU Project. https://git.savannah.gnu.org/git/guile.git
.

Repeating Value

Guile uses the syntactically latest value given for an argument as the actual value for the argument.

Syntax & Semantics

A repeating value is given by providing the name multiple times with different values:

			(add-values #:a 1000 #:b 1001 #:a 1002) ;; 2003
			
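For a runnable version of this call, the sketch below assumes the #:key definition of add-values from the section on named values. The second call is added to show that the position of the repeated name decides which value is used.

```scheme
(define* (add-values #:key a b)
  (+ (or a 0) (or b 0)))

(add-values #:a 1000 #:b 1001 #:a 1002)  ; 2003: the later #:a wins
(add-values #:a 1002 #:b 1001 #:a 1000)  ; 2001: order matters
```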
Implementation

When Guile binds values to variables, it simply walks through the list of values provided by the caller, checks for keywords that match argument names, and fills in the variable with the following value when a match is found (guile-source) libguile/vm-engine.c lines 734, 736, 743-745; libguile/vm.c lines 1003, 1008-1009
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
. This algorithm implicitly causes the overwriting behavior without any explicit check for repeated values.

Historical Record

The original implementation of the optargs module used let* for creating bindings (guile-initial-optargs) module/ice-9/optargs.scm line 261
module/ice-9/optargs.scm line 113
module/ice-9/optargs.scm line 156
module/ice-9/optargs.scm line 124
Guile Source. Commit: 7e01997e88c54216678271de36b1c2088377492d. The GNU Project. https://git.savannah.gnu.org/git/guile.git
, so it would have also exhibited the overwriting behavior.

Implicit Value

This feature is not provided by Guile.

Typed Value

This feature is provided by defining methods in the Guile Object-Oriented Programming System (GOOPS).

Syntax & Semantics

In GOOPS, methods are not contained within classes. Instead, they are simply functions with type specifications. When an author defines a method, they can choose to specify a type by providing a two-element list in place of an argument name (guile-manual) Section 8.6 "Methods and Generic Functions"
Guile Manual. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git Shell command: info guile
. The first element of the list is the argument name and the second element is the class name. For example:

			(define-method (add-values (a <number>) (b <number>))
			  (+ a b))
			

This defines a method that will only be executed when it is called with exactly two arguments which are both numbers. The author can also define alternative implementations for different types* While overloading is generally out of scope for this paper, the implementation section will not make sense without mentioning it. .

			(define-method (add-values (a <list>) (b <list>))
			  (append a b))
			
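Putting the two definitions together gives a small runnable sketch. It assumes the (oop goops) module has been loaded, which the snippets above leave implicit, and it catches the dispatch failure generically rather than relying on a particular error key.

```scheme
(use-modules (oop goops))

(define-method (add-values (a <number>) (b <number>))
  (+ a b))

(define-method (add-values (a <list>) (b <list>))
  (append a b))

(add-values 1 2)          ; 3
(add-values '(1 2) '(3))  ; (1 2 3)

;; Strings match neither signature, so dispatch fails with an error.
(catch #t
  (lambda () (add-values "foo" "bar"))
  (lambda (key . details) 'no-applicable-method))
```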
Implementation

Defining a method defines two separate functions: a generic which is responsible for method dispatching and the method which contains the body that the author defined (guile-manual) Section 8.6 "Methods and Generic Functions"
Guile Manual. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git Shell command: info guile
. A generic is an applicable struct (a structure which can be called like a function) which stores all methods which share a name (guile-source) module/oop/goops.scm lines 2045-2245
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
When a caller uses that name with some arguments, the generic object searches all of the methods associated with that name for a signature with the best fit (guile-source) module/oop/goops.scm lines 1381-1449
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
. This indirectly causes an error if the caller provides a set of values which do not have a compatible type: the searching process will fail to find a match and print a message like this one:

			ice-9/boot-9.scm:1685:16: In procedure raise-exception:
			No applicable method for #<<generic> add-values (2)> in call (add-values foo bar)
			

The bytecode for calling a function with typed values is identical to the simple call.

Historical Record

GOOPS was first added to the Guile repository in version 1.6 (guile-manual) Section 9.1.4 "A Timeline of Selected Guile Releases"
Guile Manual. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git Shell command: info guile
. Its implementation was based on STklos, the object system for STK, and it was also influenced by CLOS, the object system for Common Lisp (guile-source) module/oop/goops.scm lines 24-25
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
(guile-manual) Section 8.0 "GOOPS"
Guile Manual. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git Shell command: info guile
. The original code* I am tentatively using 4b5d86e0334f6b8c0b37c55cf47a4cd30e7803e0 as the commit which "adds GOOPS". This is somewhat fictitious; 9 commits prior (fdd70ea97c142dc8db1e3f147ac6f5bd6ae157c6) changed the major version "due to the merge of GOOPS". However, at the time this commit was made no GOOPS files had been added; the version number was adjusted in anticipation of the merge, not after the merge was complete. I am choosing the other commit as representing the initial addition of GOOPS to the repository because it is the last consecutive commit which adds or modifies a GOOPS file after the version adjustment. . looks quite different from the modern version of the file due to changes in the core language, bug fixes, and refactorings for performance and readability. However, the underlying idea of storing all alternative implementations of a function in a structure that maps type signatures to implementations has remained stable.

Caller Destructuring

Provided through the apply function. This allows a caller to provide a list which is destructured into a set of positional arguments.

Syntax & Semantics

Destructuring in Guile has a simple interface, as apply is simply a function that takes in a function as the first value, then any number of arbitrary values, and requires a possibly empty list as the final value. For example, all of these are equivalent:

			(add-values 3 4)
			(apply add-values 3 '(4))
			(apply add-values '(3 4))
			

However, this would be an error because the final value is not a list:

			(apply add-values 3 4)
			
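Because keywords are ordinary values at the call level, apply can also destructure a list of names and values. The sketch below assumes a #:key variant of add-values; saved-arguments is a hypothetical piece of caller-side data.

```scheme
(define* (add-values #:key (a 0) (b 0))
  (+ a b))

(define saved-arguments '(#:a 3 #:b 4))  ; hypothetical caller-side data

(apply add-values saved-arguments)   ; 7
(apply add-values #:a 3 '(#:b 4))    ; 7

;; The final value must be a list; otherwise apply raises an error.
(catch #t
  (lambda () (apply add-values 3 4))
  (lambda (key . details) 'final-value-not-a-list))
```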
Implementation

apply operates by first collecting all of the positional arguments into a single list, with the final value as the tail of the list, and sending them to scm_call_n* A typical Guile programmer (such as myself) who is familiar with the dotted list syntax or the #:rest argument (both described in the section on variadic functions) might be confused by the C-level API of the apply function. It takes 2 required arguments and a rest argument. The apply function takes any number of arguments and requires that the final argument is a list. Oddly, the body of the function simply calls cons* on the second required argument and the rest argument; one might expect that this would cause the user-provided list to be preserved instead of flattened. However, when a C-level function is registered with a rest argument in Guile it receives an improper list of arguments, rather than the proper list provided when #:rest is given to define*. See the file examples/guile/c-gsubr-rest.scm (and its corresponding C file) in the paper's repository for a demonstration. The reason for this discrepancy is not clear but the API of scm_apply makes sense now. (guile-source) libguile/eval.c lines 585-622, lines 715-729
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
. This function is responsible for setting up the VM stack correctly (guile-source) libguile/vm.c lines 1542-1621
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
. For example, these lines save the frame's metadata and add the arguments to the correct stack positions:

			SCM_FRAME_SET_VIRTUAL_RETURN_ADDRESS (call_fp, vp->ip);
			SCM_FRAME_SET_MACHINE_RETURN_ADDRESS (call_fp, 0);
			SCM_FRAME_SET_DYNAMIC_LINK (call_fp, return_fp);
			SCM_FRAME_LOCAL (call_fp, 0) = proc;
			for (i = 0; i < nargs; i++)
			  SCM_FRAME_LOCAL (call_fp, i + 1) = argv[i];
			

These tasks are handled by different components in the simple call. The frame's metadata is typically set by the call instruction, while the arguments are added to the stack by separate instructions such as make-immediate or scm-ref.

Historical Record

The notion of list destructuring with apply was included in the original paper defining the Lisp programming language (original-lisp-paper) Pages 189-190
McCarthy, John. Recursive functions of symbolic expressions and their computation by machine, Part I. Association for Computing Machinery, 1960. https://doi.org/10.1145/367177.367199
. This original definition took exactly 2 parameters, a function and a list containing the arguments. r2rs extended this by allowing the caller to pass any number of values between the function and the list (r2rs) Page 57
Revised Revised Report on Scheme or An UnCommon Lisp. Massachusetts Institute of Technology - Artificial Intelligence Laboratory. https://standards.scheme.org/official/r2rs.pdf

In the original interpreter, Guile simply called the function after manually unpacking the arguments (guile-first-commit) libguile/eval.c lines 1869-1985
Guile Source. Commit: 0f2d19dd46f83f41177f61d585732b32a866d613. The GNU Project. https://git.savannah.gnu.org/git/guile.git
With the release of Guile 2.0, which added the virtual machine (guile-source) NEWS lines 5068-5071
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
the apply function delegated to scm_call_with_vm, a dedicated C-level helper function for calling scheme-level functions (guile-2.0-source) libguile/eval.c lines 782-802
Guile Source. 2.0. Commit: 958a28e9fec33ebb4673294308a82ccd18cc6071. The GNU Project. https://git.savannah.gnu.org/git/guile.git
. This was replaced with the scm_call_n function as part of a refactoring which removed explicit references to the VM from many locations in C code as well as eliminating VM visibility from scheme code (guile-use-scm-call-n) Commit message and diff of libguile/eval.scm at -486,41 +486,41
Guile Source. Commit: 55ee3607003702ef5c53994c6216b9f0f835e0f1. The GNU Project. https://git.savannah.gnu.org/git/guile.git
The reason for this change is not explicitly stated in the commit messages but is likely related to performance improvements attributed to a rewrite of the virtual machine in the Guile 2.2 release notes (guile-source) NEWS lines 1995-2004, 2124-2205
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git

Author Destructuring

This feature is not provided by Guile.

Variadic Function

Provided by the "dotted list" syntax and the #:rest keyword in a define*.

Syntax & Semantics

The #:rest syntax allows an author to define a variadic variable that will contain a (possibly empty) list of variadic values. The variadic variable must be declared after all other declarations. For example, this definition and call would have the following value* As the variadic variable contains a list, the below code uses apply to destructure the list. This is described in detail in the section on caller destructuring. :

			(define* (add-values a b #:rest variadic-variable)
				(apply + a b variadic-variable))
			
			(add-values 1000 2001 1002 2003)
			; 6006
			

The dotted list syntax is similar, except that the author uses a single period instead of the #:rest keyword and this syntax works without optargs:

			(define (add-values a b . variadic-variable)
				(apply + a b variadic-variable))
			
			(add-values 1000 2001 1002 2003)
			; 6006
			

Other than this syntactic difference, the two methods of defining a variadic function are identical.

Implementation

For definitions, the only notable difference from the simple function is the addition of the bind-rest instruction near the beginning of the body. This instruction takes values from the stack and constructs a new list which contains all of them; this list is bound to the variadic variable (guile-source) libguile/vm-engine.c lines 756-783
libguile/vm.c lines 1041-1052
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git
.

			static SCM
			cons_rest (scm_thread *thread, uint32_t base)
			{
			  SCM rest = SCM_EOL;
			  uint32_t n = frame_locals_count (thread) - base;
			
			  while (n--)
			    rest = scm_inline_cons (thread, SCM_FRAME_LOCAL (thread->vm.fp, base + n),
			                            rest);
			
			  return rest;
			}
			

For calls, the bytecode is identical to the simple call.

Historical Record

The dotted list syntax was first described in r2rs, with the same semantics that are in use today (r2rs) Page 13
Revised Revised Report on Scheme or An UnCommon Lisp. Massachusetts Institute of Technology - Artificial Intelligence Laboratory. https://standards.scheme.org/official/r2rs.pdf

The #:rest syntax was originally used by other lisp dialects and implementations of Scheme; Guile provided it in the original release of the optargs module in order to ease the transition for programmers used to those languages (guile-source) NEWS lines 11514-11518
Guile Source. 3.1.2. Commit: 3b76a30e3ca1f0b7ee7944836c2fc5660596b3bd. The GNU Project. https://git.savannah.gnu.org/git/guile.git

As described in the section on positional values, the oldest versions of Guile manually collected argument values into a Scheme-level list and dispatched based on the number of values given. This made it trivial for the interpreter to supply the variadic values by dropping elements from the front of the list of all arguments (guile-first-commit) libguile/eval.c lines 1652-1655
Guile Source. Commit: 0f2d19dd46f83f41177f61d585732b32a866d613. The GNU Project. https://git.savannah.gnu.org/git/guile.git
.

When the VM was added in version 2.0, the bind-rest instruction was included. It performs essentially the same work as the current implementation (guile-2.0-source) libguile/vm-i-system.c lines 718-732
Guile Source. 2.0. Commit: 958a28e9fec33ebb4673294308a82ccd18cc6071. The GNU Project. https://git.savannah.gnu.org/git/guile.git
.

Rust

Rust provides the following features:

TODO: Talk about faked default values and named values? https://www.thecodedmessage.com/posts/default-params/

Background

Rust has a compiler which produces native code. It does not have a stable ABI and creating one is not considered desirable by the Rust developers (#600) Issue #600. "Define a Rust ABI". https://github.com/rust-lang/rfcs/issues/600 . When targeting Linux, the compiler appears to produce code that follows the System V ABI.

This paper uses Intel-style assembly syntax, for ease of use with the Intel Software Developer Manual.

Simple Function and Call

			fn add_values(a: i32, b: i32) -> i32 {
			    a + b
			}
			
			fn main() {
			    add_values(1000, 1001);
			}
			

The assembly for the call to add_values does 2 things. First, it moves the literal values 1000 and 1001 into the edi and esi registers, respectively. This follows the System V convention that the first 2 arguments are passed in these registers (the specification implicitly assumes that all arguments are positional) (SV-ABI) Page 27.
System V Application Binary Interface. 2025. Lu, H.J., et al. https://gitlab.com/x86-psABIs/x86-64-ABI/-/jobs/10470667619/artifacts/file/x86-64-ABI/abi.pdf
. Then it calls the add_values function.

			mov    edi,0x3e8
			mov    esi,0x3e9
			call   7940 <_ZN6simple10add_values17h33b93751bca0b5fbE>
			

The body of add_values does a number of things. First it pushes the rax register onto the stack; this conventionally contains the return value of function calls (SV-ABI) Page 27.
System V Application Binary Interface. 2025. Lu, H.J., et al. https://gitlab.com/x86-psABIs/x86-64-ABI/-/jobs/10470667619/artifacts/file/x86-64-ABI/abi.pdf
. Next, it adds the given arguments directly in the registers used to pass them and moves the result onto the stack using rsp, which conventionally contains the stack pointer (SV-ABI) Page 27.
System V Application Binary Interface. 2025. Lu, H.J., et al. https://gitlab.com/x86-psABIs/x86-64-ABI/-/jobs/10470667619/artifacts/file/x86-64-ABI/abi.pdf
. The next two instructions, seto and jo, are used to handle overflows (in debug mode, Rust panics on an overflow). Assuming there is no overflow, the code then moves the result of the addition from the stack into the eax* Note that eax is a subset of rax, so this is setting the lower bits of the rax register; it is only setting the lower bits due to the size of the integer specified in the code. register. Finally, it moves the top of the stack into the rcx register* It is not clear why it is helpful to pop into rcx; it is conventionally used as the 4th positional argument and is not referenced at all in the calling site. and returns to the calling site.

			0000000000007940 <_ZN6simple10add_values17h33b93751bca0b5fbE>:
			push   rax
			add    edi,esi
			mov    DWORD PTR [rsp+0x4],edi
			seto   al
			jo     7952 <_ZN6simple10add_values17h33b93751bca0b5fbE+0x12>
			mov    eax,DWORD PTR [rsp+0x4]
			pop    rcx
			ret    
			lea    rdi,[rip+0x4c5ef]        # 53f48 <_ZN3std3sys3pal4unix4args3imp15ARGV_INIT_ARRAY17hf0e4ba666b615240E+0x48>
			call   QWORD PTR [rip+0x4f061]        # 569c0 <_GLOBAL_OFFSET_TABLE_+0x1c0>
			nop
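
The overflow branch described above can also be made visible at the source level. The following is a minimal sketch, not the code the compiler generates for `a + b`: it uses the standard library's `checked_add`, and the name `add_values_checked` is hypothetical. It reports the same condition that the `jo` instruction branches on, but returns `None` instead of panicking.

```rust
// Hypothetical variant of add_values that makes the overflow check explicit.
fn add_values_checked(a: i32, b: i32) -> Option<i32> {
    // checked_add returns None when the signed addition overflows,
    // which is the condition the overflow flag records after `add`.
    a.checked_add(b)
}

fn main() {
    // The call from the simple example does not overflow.
    assert_eq!(add_values_checked(1000, 1001), Some(2001));
    // Adding 1 to the largest i32 overflows; a debug build of `a + b` would panic here.
    assert_eq!(add_values_checked(i32::MAX, 1), None);
}
```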
			

Positional Value

Provided as the only way to pass arguments.

Syntax & Semantics

Callers specify positional values by providing a comma-separated list of values. Authors specify local variable names with a corresponding list of comma-separated identifiers.

Implementation

The simple function and call use positional arguments; there are no additional features to examine here.

Historical Record

Rust has used positional arguments since the pre-RFC era. Systems programming languages typically use positional arguments, as evidenced by the assumption in the System V ABI that all arguments are positional.

Default Value

This feature is not provided by Rust.

Named Value

This feature is not provided by Rust.

Repeating Value

This feature is not supported by Rust. As it only allows passing values by position, it is not syntactically possible to supply multiple values for the same argument.

Implicit Value

This feature is not provided by Rust.

Typed Value

All function arguments in Rust are required to have an associated type.

Syntax & Semantics

Types are specified by placing a colon after the local variable name, then the type name:

			fn add_values(a: i32, b: i32) -> i32 {
			    a + b
			}
			

There are 2 reasons why the compiler might reject code due to the declared types. One reason is that the given values are explicitly incompatible. For example, if the user tries to pass literal strings when calling the above definition of add_values, the compiler will emit this error (rust-source) compiler/rustc_hir_typeck/src/fn_ctxt/check.rs lines 791-793, 748-753
Rust Source. 1.87.0. Commit: 17067e9ac6d7ecb70e50f92c1944e545188d2359. The Rust Foundation. https://github.com/rust-lang/rust
:

			error[E0308]: arguments to this function are incorrect
			 --> simple.rs:7:5
			  |
			7 |     add_values("1000", "1001");
			  |     ^^^^^^^^^^ ------  ------ expected `i32`, found `&str`
			  |                |
			  |                expected `i32`, found `&str`
			  |
			note: function defined here
			 --> simple.rs:1:4
			  |
			1 | fn add_values(a: i32, b: i32) -> i32 {
			  |    ^^^^^^^^^^ ------  ------
			
			error: aborting due to 1 previous error
			

The other reason for an error is that type inference fails. While local variables declared in the function signature must have an explicit type, local variables declared in the body of the function do not; in this case, there is still an associated type but this type is inferred by the compiler based on usage. However, the programmer might try to use the variables in such a way that no associated type will satisfy all of the usage constraints. For example, if we additionally define this function:

			fn add_unsigned_values(a: u32, b: u32) -> u32 {
			    a + b
			}
			

Then try to call both of them with the same variables:

			fn main() {
			    let a = 1000;
			    let b = 1001;
			    add_values(a, b);
			    add_unsigned_values(a, b);
			}
			

The compiler will first infer that a and b must be of type i32 based on the first call (rust-source) compiler/rustc_hir_typeck/src/fn_ctxt/check.rs lines 313-319
Rust Source. 1.87.0. Commit: 17067e9ac6d7ecb70e50f92c1944e545188d2359. The Rust Foundation. https://github.com/rust-lang/rust
, then finds an incompatibility in the second call (rust-source) compiler/rustc_hir_typeck/src/fn_ctxt/check.rs lines 791-793, 748-753
Rust Source. 1.87.0. Commit: 17067e9ac6d7ecb70e50f92c1944e545188d2359. The Rust Foundation. https://github.com/rust-lang/rust
:

			error[E0308]: arguments to this function are incorrect
			  --> typed-value-error.rs:13:5
			   |
			13 |     add_unsigned_values(a, b);
			   |     ^^^^^^^^^^^^^^^^^^^ -  - expected `u32`, found `i32`
			   |                         |
			   |                         expected `u32`, found `i32`
			   |
			note: function defined here
			  --> typed-value-error.rs:5:4
			   |
			5  | fn add_unsigned_values(a: u32, b: u32) -> u32 {
			   |    ^^^^^^^^^^^^^^^^^^^ ------  ------
			help: you can convert an `i32` to a `u32` and panic if the converted value doesn't fit
			   |
			13 |     add_unsigned_values(a.try_into().unwrap(), b);
			   |                          ++++++++++++++++++++
			help: you can convert an `i32` to a `u32` and panic if the converted value doesn't fit
			   |
			13 |     add_unsigned_values(a, b.try_into().unwrap());
			   |                             ++++++++++++++++++++
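
For comparison, here is a sketch of one way the conflict could be resolved, following the conversion suggested in the compiler's help text. It is only one option (declaring separate variables with explicit types would also work), and the explicit `TryInto` import is included on the assumption that the code may be built with an edition older than 2021.

```rust
use std::convert::TryInto;

fn add_values(a: i32, b: i32) -> i32 {
    a + b
}

fn add_unsigned_values(a: u32, b: u32) -> u32 {
    a + b
}

fn main() {
    // Both variables are inferred as i32 from the first call below.
    let a = 1000;
    let b = 1001;
    let signed = add_values(a, b);

    // Explicit, fallible conversion from i32 to u32; unwrap panics if the
    // value does not fit in a u32 (for example, a negative number).
    let unsigned = add_unsigned_values(a.try_into().unwrap(), b.try_into().unwrap());

    assert_eq!(signed, 2001);
    assert_eq!(unsigned, 2001);
}
```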
			
Implementation

The generated assembly accesses memory under the assumption that the given values and/or pointers contain the expected data layout. While this means that the generated assembly will be different if the types in use are changed, there is no type-checking logic to examine in the final output.

Historical Record

Rust has required types on local variables declared in function signatures since the pre-RFC era.

Caller Destructuring

This feature is not provided by Rust.

Author Destructuring

In Rust both tuples and structs can be destructured* Technically enums and slices can be destructured in function signatures as well, but the pattern must be irrefutable - meaning that only enums with a single variant can be destructured, and slices can only be destructured if all values are ignored. by the author.

Syntax & Semantics

For both tuples and structs the author specifies a pattern in place of a variable name:

			fn add_values_tuple_destructured((a, b): &(i32, i32)) -> i32 {
			    a + b
			}
			
			struct Arguments {
			    a: i32,
			    b: i32,
			}
			
			fn add_values_struct_destructured(Arguments { a: a, b: b }: &Arguments) -> i32 {
			    a + b
			}
			

In each case, the local variables are named by the pattern on the left side of the colon. In the case of structs, the right side of each inner colon is the name of the local variable, while the left side is the name of the member in the struct.
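
To make the roles of the inner colons concrete, the sketch below binds the struct members to differently named local variables. The function name and the local names `first` and `second` are hypothetical and chosen only for illustration.

```rust
struct Arguments {
    a: i32,
    b: i32,
}

// `a: first` reads the member `a` and binds it to the local variable `first`;
// `b: second` does the same for the member `b`.
fn add_values_struct_renamed(Arguments { a: first, b: second }: &Arguments) -> i32 {
    first + second
}

fn main() {
    let arguments = Arguments { a: 1000, b: 1001 };
    assert_eq!(add_values_struct_renamed(&arguments), 2001);
}
```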

Additionally, the author can choose to ignore some members of an input when destructuring. They do this by specifying only the elements that they want to match and placing a literal .. to indicate that other elements should be ignored:

			fn add_values_tuple_too_many_destructured((a, b, ..): &(i32, i32, i32)) -> i32 {
			    a + b
			}
			
			struct TooManyArguments {
			    a: i32,
			    b: i32,
			    #[allow(dead_code)]
			    c: i32,
			}
			
			fn add_values_struct_too_many_destructured(TooManyArguments { a, b, .. }: &TooManyArguments) -> i32 {
			    a + b
			}
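
From the caller's side nothing changes: the full tuple or struct is still supplied, and the ignored members are never bound in the callee. The sketch below repeats the definitions above so that it is self-contained; the value 9999 is arbitrary. Note that the struct pattern here uses the shorthand form, where `a` stands for `a: a`.

```rust
fn add_values_tuple_too_many_destructured((a, b, ..): &(i32, i32, i32)) -> i32 {
    a + b
}

struct TooManyArguments {
    a: i32,
    b: i32,
    #[allow(dead_code)]
    c: i32,
}

fn add_values_struct_too_many_destructured(TooManyArguments { a, b, .. }: &TooManyArguments) -> i32 {
    a + b
}

fn main() {
    // The third element is supplied by the caller but ignored by the pattern.
    assert_eq!(add_values_tuple_too_many_destructured(&(1000, 1001, 9999)), 2001);

    // The member `c` must still be initialized, even though the callee ignores it.
    let arguments = TooManyArguments { a: 1000, b: 1001, c: 9999 };
    assert_eq!(add_values_struct_too_many_destructured(&arguments), 2001);
}
```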
			
Implementation

Use of destructuring does not change how the argument is passed. In either case, a pointer to the given structure is passed using the expected register. For example, given the following function definitions:

			fn add_values_tuple(tuple: &(i32, i32)) -> i32 {
			    tuple.0 + tuple.1
			}
			
			fn add_values_tuple_destructured((a, b): &(i32, i32)) -> i32 {
			    a + b
			}
			
			fn add_values_struct(arguments: &Arguments) -> i32 {
			    arguments.a + arguments.b
			}
			
			fn add_values_struct_destructured(Arguments { a: a, b: b }: &Arguments) -> i32 {
			    a + b
			}
			

The calling site will always load the address of the structure into rdi:

			lea    rdi,[rip+0x3e600]        # 46018 <_fini+0x24c>
			call   7970 <_ZN20author_destructuring16add_values_tuple17h78fb3f52970da270E>
			lea    rdi,[rip+0x3e5f4]        # 46018 <_fini+0x24c>
			call   79a0 <_ZN20author_destructuring29add_values_tuple_destructured17h565795344f78b8c3E>
			lea    rdi,[rip+0x3e5e8]        # 46018 <_fini+0x24c>
			call   79c0 <_ZN20author_destructuring17add_values_struct17h7b66a2e7b4d693a8E>
			lea    rdi,[rip+0x3e5dc]        # 46018 <_fini+0x24c>
			call   79f0 <_ZN20author_destructuring30add_values_struct_destructured17hfb478771134896efE>
			

However, the bodies of the functions differ* While the bodies of the destructured functions differ from their non-destructured counterparts, the differences between a destructured tuple and destructured struct are negligible. when destructuring is used. When it is not used, the function looks similar to the simple function, except that it accumulates the values into eax by dereferencing the pointer in rdi instead of using edi (and esi) directly:

			push   rax
			mov    eax,DWORD PTR [rdi]     <---- pointer dereferenced>
			add    eax,DWORD PTR [rdi+0x4] <---- dereferenced with offset, for "second" argument>
			mov    DWORD PTR [rsp+0x4],eax <---- eax stored on the stack>
			seto   al
			jo     7985 <_ZN20author_destructuring16add_values_tuple17h78fb3f52970da270E+0x15>
			mov    eax,DWORD PTR [rsp+0x4] <---- eax restored from stack>
			pop    rcx
			ret    
			lea    rdi,[rip+0x4c574]        # 53f00 <_ZN3std3sys3pal4unix4args3imp15ARGV_INIT_ARRAY17hf0e4ba666b615240E+0x48>
			call   QWORD PTR [rip+0x4f02e]        # 569c0 <_GLOBAL_OFFSET_TABLE_+0x1c0>
			cs nop WORD PTR [rax+rax*1+0x0]
			nop    DWORD PTR [rax+0x0]
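
The 0x4 offset in the listing above is consistent with the sizes of the types involved. The sketch below checks those sizes with `std::mem::size_of`. It assumes a 64-bit target. Rust does not guarantee the order of members in tuples or default-representation structs, so the sketch checks only total sizes, not which member sits at which offset.

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Arguments {
    a: i32,
    b: i32,
}

fn main() {
    // Each i32 occupies 4 bytes, so two of them occupy offsets 0x0 and 0x4.
    assert_eq!(size_of::<i32>(), 4);
    assert_eq!(size_of::<(i32, i32)>(), 8);
    assert_eq!(size_of::<Arguments>(), 8);
    // A reference is pointer-sized, which is why the address is passed in the
    // full-width rdi register rather than its 32-bit subset edi.
    assert_eq!(size_of::<&(i32, i32)>(), size_of::<usize>());
}
```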
			

However, when destructuring is used, a pointer to the second value is loaded into rsi and a helper function is called:

			push   rax
			mov    rsi,rdi <---- rsi initialized with rdi>
			add    rsi,0x4 <---- rsi offset by same offset as in the non-destructuring version>
			lea    rdx,[rip+0x4c569]        # 53f18 <_ZN3std3sys3pal4unix4args3imp15ARGV_INIT_ARRAY17hf0e4ba666b615240E+0x60>
			call   78d0    <---- helper function called> <_ZN49_$LT$$RF$i32$u20$as$u20$core..ops..arith..Add$GT$3add17h90bb72e1fc0da902E>
			pop    rcx
			ret    
			cs nop WORD PTR [rax+rax*1+0x0]
			

The helper function is similar to the accumulation performed in the non-destructured versions:

			mov    eax,DWORD PTR [rdi]      <---- eax initialized with first argument>
			add    eax,DWORD PTR [rsi]      <---- second argument accumulated into eax>
			mov    DWORD PTR [rsp+0x14],eax <---- eax stored on the stack>
			seto   al
			jo     78ef <_ZN49_$LT$$RF$i32$u20$as$u20$core..ops..arith..Add$GT$3add17h90bb72e1fc0da902E+0x1f>
			mov    eax,DWORD PTR [rsp+0x14] <---- eax restored from stack>
			add    rsp,0x18
			ret    
			

The discrepancy is a reasonable literal translation of destructuring into assembly code: instead of dereferencing the pointer inline, the second part of the pattern is moved into the second positional register and a helper function which takes 2 (theoretically unrelated) positional arguments is called.
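
The helper's symbol name appears to demangle to `<&i32 as core::ops::arith::Add>::add`, which suggests a source-level reading of the same behavior: when a tuple pattern is matched against a reference, `a` and `b` are bound as references, so `a + b` adds two `&i32` values rather than two `i32` values. The sketch below spells this out. The function `add_values_tuple_explicit` is a hypothetical equivalent written for illustration, and the link to the generated assembly is an inference from the symbol name rather than something confirmed in the compiler source.

```rust
use std::ops::Add;

fn add_values_tuple_destructured((a, b): &(i32, i32)) -> i32 {
    // Matching a tuple pattern against a reference binds each name by reference.
    let _: &i32 = a;
    let _: &i32 = b;
    a + b
}

// Hypothetical spelled-out equivalent of the function above.
fn add_values_tuple_explicit(tuple: &(i32, i32)) -> i32 {
    let a: &i32 = &tuple.0; // same address as the tuple itself
    let b: &i32 = &tuple.1; // address of the second member
    // Addition on two references goes through the Add implementation for &i32.
    <&i32 as Add<&i32>>::add(a, b)
}

fn main() {
    let tuple = (1000, 1001);
    assert_eq!(add_values_tuple_destructured(&tuple), 2001);
    assert_eq!(add_values_tuple_explicit(&tuple), 2001);
}
```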

The assembly for the bodies of destructuring functions that ignore some elements is identical to the above assembly; ignored values are literally ignored.

Historical Record

Pattern-matching has been available in Rust since the pre-RFC era. However, it has evolved in the RFC era.

RFC 1492 allows the construct .. to appear alongside variable names in destructured tuples. This has the effect of ignoring unspecified members; previously, .. could only be used to ignore all members in a tuple (RFC 1492) text/1492-dotdot-in-patterns.md
Rust Requests For Comment (RFCs). Commit: ea1f59fbc782b0c4cab5b19a628d9aa9ad834a58. Rust Foundation. https://github.com/rust-lang/rfcs.git
. It was added for consistency with the similar feature usable in slices (subsets of sequences) * The utility of this feature was probably seen as obvious by the developers, due to their experience with the similar feature for slices. It makes code more terse and readable when the author does not need all of the tuple members. .

There have been other changes to patterns through the RFC process; however, they are not relevant here because patterns in function signatures must be irrefutable. For example, there have been changes to conditional guards on patterns, which can be refuted when the condition evaluates to false.

Variadic Function

This feature is not provided by Rust.