The prospect of actively controlling the shape of aerospace structures has motivated research into Shape Memory Alloy actuators. These actuators can be used for morphing, or shape change, by controlling their temperature, which in practice is done by applying a voltage difference across their length. Reinforcement Learning has previously been used to characterize the temperature-strain relationship of these actuators, but for controlling Shape Memory Alloy wires it is more useful to learn the voltage-position relationship. In numerical simulation, Reinforcement Learning has been used to determine the temperature-strain relationship, characterizing the major and minor hysteresis loops, and to determine a limited control policy relating applied temperature to desired strain. Because Reinforcement Learning produces a non-parametric control policy, and no general parametric model of this policy currently exists, the voltage-position relationship for a Shape Memory Alloy is determined separately. This paper extends earlier numerical simulation results and experimental results in temperature-strain space by applying a similar Reinforcement Learning algorithm to voltage-position space using an experimental hardware apparatus. The results show that an improved Reinforcement Learning algorithm converges on a near-optimal control policy for Shape Memory Alloy length control. They demonstrate that Reinforcement Learning is an effective method of constructing a policy capable of controlling Shape Memory Alloy wire length.